Analysis Approach: Deriving New Columns in Excel

In the previous segment, you learnt about the five simple patterns into which you can classify any insight. Now you’ll learn the methods you can use to actually generate those insights.

As explained by Anand above, the five patterns are sufficient for categorising insights. However, your dataset may not be in a format that lets you apply the patterns directly. Therefore, you also need to learn a few techniques for analysing your data so that you can apply the five patterns to it and start generating insights. These analyses fall into two types: exploratory data analysis and hypothesis-driven analysis. Both types rely on two methods of data manipulation to extract insights: creating new columns and reducing the number of rows.

Let’s start with the first of these two methods: deriving new columns.

Deriving new columns means computing additional fields from the data you already have. For example, if you have a dataset where only the revenue and cost information is available, you can create a separate column that calculates the profit and then apply the five patterns to it. As discussed in the video, some of the most common ways of adding new columns to the data are:

Metadata Lookup: Metadata is essentially an additional dataset or data sheet that provides information about the original data. In the example mentioned by Anand, the population of each country was available in a separate sheet, and you performed a VLOOKUP to bring that column into the original dataset.
Calculations: You can perform a variety of calculations using the numeric columns in your dataset. For example, in the video above, you created a new column named Suicide Rate (%) from the Suicides and Population columns.
Binning: This process groups the values of a numeric column into specific categories. In the above example, you converted the Suicide Rate into bins and categorised them as High, Medium or Low.
Business-Specific Metrics: These are specific to your domain, so the metrics or KPIs that you already use in your business would make useful additional columns.
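
The first three methods above can also be sketched outside Excel. The following is a minimal Python illustration, not the workbook used in the video: the country names, the figures, the function name `derive_columns` and the bin cutoffs are all hypothetical and chosen only to make the example readable. It looks up the population from a separate table (the VLOOKUP step), calculates the rate as a percentage, and then bins that rate into Low, Medium or High.

```python
# Hypothetical main dataset: one row per country (figures are made up).
suicide_rows = [
    {"country": "A", "suicides": 1200},
    {"country": "B", "suicides": 300},
    {"country": "C", "suicides": 3000},
]

# Hypothetical metadata "sheet": country -> population.
population_by_country = {"A": 10_000_000, "B": 1_000_000, "C": 15_000_000}


def derive_columns(rows, populations, low_cutoff=0.015, high_cutoff=0.025):
    """Return copies of the rows with three derived columns added.

    The cutoffs are illustrative assumptions, not values from the course.
    """
    derived = []
    for row in rows:
        new_row = dict(row)

        # 1. Metadata lookup: fetch the population from the separate table.
        #    A missing country yields None, much like a failed VLOOKUP.
        population = populations.get(row["country"])
        new_row["population"] = population

        if population:
            # 2. Calculation: suicides as a percentage of the population.
            rate = row["suicides"] / population * 100
            # 3. Binning: turn the numeric rate into a category.
            if rate < low_cutoff:
                band = "Low"
            elif rate < high_cutoff:
                band = "Medium"
            else:
                band = "High"
        else:
            rate, band = None, None

        new_row["suicide_rate_pct"] = rate
        new_row["rate_band"] = band
        derived.append(new_row)
    return derived


derived_rows = derive_columns(suicide_rows, population_by_country)
for row in derived_rows:
    print(row["country"], row["suicide_rate_pct"], row["rate_band"])
```

Each derived column is just another field on the row, so the five patterns can then be applied to `suicide_rate_pct` or `rate_band` exactly as they would be to an original column.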

These methods are not exhaustive, since a variety of analysis procedures can be used to derive new columns. But they more or less cover the standard ways of proceeding once you have the data and want to check it for new insights.

Another nifty way of creating new columns, and one that is heavily used right now, is machine learning. Watch the following video to understand the different ways in which you can derive new information from a given dataset.

As explained above, the various machine learning techniques that you can use to create new columns are as follows.

Classification
Clustering
Time Series Analysis
Feature Extraction
Sentiment Analysis

(Note: You will learn some of these techniques later in the course.)

To summarise, there are broadly two techniques for creating new columns: performing calculations and using models, either statistical or machine learning. Statistical models often take a sample of the original data and infer from it the behaviour of the entire population. In machine learning models, you run algorithms on a predefined set of data called "train data" to build the model, and then run the model on another set of data called "test data" to check its accuracy and precision. If some of these terms seem like jargon to you, don't worry. You'll learn these concepts in detail in the next two courses. For the time being, a cursory understanding of the difference is sufficient.
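
To make the train data and test data idea concrete, here is a minimal Python sketch. It uses a nearest-centroid classifier, chosen only because it fits in a few lines; it is not a technique named in this segment. The data, the `small`/`large` labels and all function names are hypothetical. The model is built from the train data, checked on the test data, and its predictions become a new derived column. Only accuracy is computed here, for brevity.

```python
def fit_centroids(train_rows):
    """Learn one centroid (mean feature vector) per label from the train data."""
    sums, counts = {}, {}
    for features, label in train_rows:
        totals = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            totals[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [total / counts[label] for total in totals]
        for label, totals in sums.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def accuracy(centroids, test_rows):
    """Fraction of test rows whose predicted label matches the known label."""
    correct = sum(predict(centroids, f) == label for f, label in test_rows)
    return correct / len(test_rows)


# Hypothetical labelled rows: (revenue, cost) -> customer segment.
train_data = [
    ((1.0, 1.2), "small"),
    ((1.5, 1.0), "small"),
    ((8.0, 9.0), "large"),
    ((9.0, 8.5), "large"),
]
test_data = [
    ((1.2, 0.8), "small"),
    ((8.5, 9.5), "large"),
]

centroids = fit_centroids(train_data)        # build the model on train data
test_accuracy = accuracy(centroids, test_data)  # check it on unseen test data

# The predictions form a new, model-derived column for the test rows.
predicted_segment = [predict(centroids, features) for features, _ in test_data]
print(test_accuracy, predicted_segment)
```

Keeping the test rows out of `fit_centroids` is the point of the split: the accuracy then reflects how the model behaves on data it has not seen.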

After deriving these new columns, you can apply the five patterns that you learnt earlier to generate insights. In the next segment, you’ll learn about the other way of analysing the given data: summarising the rows.