
Grouping and Pivoting in Python


Grouping and aggregation are some of the most frequently used operations in data analysis, especially while performing exploratory data analysis (EDA), where comparing summary statistics across groups of data is common.
 

As an example, in the retail sales data that we are working with, you may want to compare the average sales of various regions or compare the total profits of two customer segments.

 

Grouping analysis can be thought of as having three parts, namely:

  • Splitting the data into groups (e.g., groups of customer segments, product categories, etc.)
  • Applying a function to each group (e.g., the mean or total sales of each customer segment)
  • Combining the results into a data structure showing summary statistics
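The three parts above can be sketched by hand and then compared with a single groupby() call. This is only an illustration: the small dataframe and its Segment and Sales columns are hypothetical stand-ins for the retail sales data, not the actual dataset.

```python
import pandas as pd

# Hypothetical stand-in for the retail sales data; column names are assumed.
sales = pd.DataFrame({
    "Segment": ["Consumer", "Corporate", "Consumer", "Corporate"],
    "Sales": [100.0, 200.0, 300.0, 400.0],
})

# 1. Split: one sub-dataframe per customer segment.
groups = {name: part for name, part in sales.groupby("Segment")}

# 2. Apply: compute the mean sales of each group.
means = {name: part["Sales"].mean() for name, part in groups.items()}

# 3. Combine: collect the per-group results into a single Series.
manual_summary = pd.Series(means, name="Sales").sort_index()

# groupby() followed by an aggregate performs all three steps in one expression.
groupby_summary = sales.groupby("Segment")["Sales"].mean()

print(manual_summary)
print(groupby_summary)
```

Both summaries hold the same per-segment means; in practice, you would normally write only the last form.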

 

Let us now learn how to perform grouping on Pandas dataframes using the same dataset as before. Download the notebook provided below for this segment.


The groupby() function returns a Pandas object, which can be used further to perform the desired aggregate functions.
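As a rough sketch of this behaviour, the example below creates a groupby object and then applies two different aggregate functions to it. The data and the Region, Sales and Profit column names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data; the real notebook uses the retail sales dataset.
sales = pd.DataFrame({
    "Region": ["East", "West", "East", "West", "East"],
    "Sales": [10.0, 20.0, 30.0, 40.0, 50.0],
    "Profit": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# groupby() returns a DataFrameGroupBy object; no statistics are computed yet.
by_region = sales.groupby("Region")
print(type(by_region).__name__)

# The same object can be reused for several aggregate functions.
total_by_region = by_region[["Sales", "Profit"]].sum()
mean_sales_by_region = by_region["Sales"].mean()

print(total_by_region)
print(mean_sales_by_region)
```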


Let’s take a look at another aggregate function being applied on the groupby object created above.


One point to note here is that groupby() works on index levels as well as on columns. In both cases, you can chain the aggregate function directly onto groupby(); storing the groupby object in a variable first is optional, but it is convenient when you want to run several aggregate functions on the same groups.
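As a minimal sketch of the two cases, the example below groups once on an index level and once on a column, storing the groupby object before aggregating in the column case. The data and the Segment and Profit names are hypothetical.

```python
import pandas as pd

# Hypothetical data with Segment as an ordinary column.
sales = pd.DataFrame({
    "Segment": ["Consumer", "Corporate", "Consumer", "Corporate"],
    "Profit": [5.0, 7.0, 11.0, 13.0],
})

# Case 1: Segment is the index, and grouping refers to the index level by name.
indexed = sales.set_index("Segment")
profit_from_index = indexed.groupby(level="Segment")["Profit"].sum()

# Case 2: Segment is a column; the groupby object is stored first
# and the aggregate function is then run on the stored object.
by_segment = sales.groupby("Segment")
profit_from_column = by_segment["Profit"].sum()

print(profit_from_index)
print(profit_from_column)
```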


Using grouping, you can easily summarise the data stored in different columns. Let’s take a look at another method to summarise the dataframes, namely, Pivots.
 

Pivoting

A pivot table summarises a dataframe by placing the values of one variable along the rows, the values of another along the columns, and a summary value in each cell. It acts as an alternative to the groupby() function in Pandas. Pivot tables provide Excel-like functionalities to create aggregate tables.
 

Let’s watch the following video and learn how to create pivot tables in Python using Pandas.


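You can use the following command to create pivot tables in Pandas. Note that pivot() only reshapes the data and does not aggregate it, so each combination of index and column values must appear at most once; otherwise, Pandas raises a ValueError: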

df.pivot(columns='grouping_variable_col', values='value_to_aggregate', index='grouping_variable_row')
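To make the parameters concrete, here is a small sketch of pivot() on made-up data. The Region, Segment and Sales column names are assumptions chosen to resemble the retail example.

```python
import pandas as pd

# Hypothetical long-format data with one row per Region/Segment pair.
summary = pd.DataFrame({
    "Region": ["East", "East", "West", "West"],
    "Segment": ["Consumer", "Corporate", "Consumer", "Corporate"],
    "Sales": [100.0, 150.0, 200.0, 250.0],
})

# Regions become the rows, segments become the columns,
# and the Sales values fill the cells.
wide = summary.pivot(index="Region", columns="Segment", values="Sales")

print(wide)
```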

 

Let’s take a look at another example of pivoting.


Using the pivot_table() function, you can also specify the aggregate function that you would want Pandas to execute over the columns provided. It could be the same or different for each column in the dataframe.

df.pivot_table(values=['value_1', 'value_2'], index='grouping_variable_row', aggfunc={'value_1': 'mean', 'value_2': ['min', 'max', 'mean']})

 

The function above, when substituted with proper values, returns the mean of value_1 and three statistics of value_2 (the minimum, the maximum and the mean) for each row of the resulting table, that is, for each group defined by the index.
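A short sketch of this pattern on hypothetical data follows. The Segment, Sales and Profit names are assumed, and the aggregate functions are given as strings so that no NumPy import is needed.

```python
import pandas as pd

# Hypothetical data with several rows per segment, so aggregation is required.
sales = pd.DataFrame({
    "Segment": ["Consumer", "Consumer", "Corporate", "Corporate"],
    "Sales": [100.0, 300.0, 200.0, 400.0],
    "Profit": [10.0, 30.0, 20.0, 60.0],
})

# A different set of aggregate functions is applied to each value column.
table = sales.pivot_table(
    values=["Sales", "Profit"],
    index="Segment",
    aggfunc={"Sales": "mean", "Profit": ["min", "max", "mean"]},
)

print(table)
```

Because a list of functions is passed for one of the columns, the resulting columns are labelled by pairs such as ("Profit", "max"), so an individual statistic can be selected with table[("Profit", "max")].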


In the next segment, you will learn how to deal with multiple dataframes.