COURSES
MBAData Science & AnalyticsDoctorate Software & Tech AI | ML MarketingManagement
Professional Certificate Programme in HR Management and AnalyticsPost Graduate Certificate in Product ManagementExecutive Post Graduate Program in Healthcare ManagementExecutive PG Programme in Human Resource ManagementMBA in International Finance (integrated with ACCA, UK)Global Master Certificate in Integrated Supply Chain ManagementAdvanced General Management ProgramManagement EssentialsLeadership and Management in New Age BusinessProduct Management Online Certificate ProgramStrategic Human Resources Leadership Cornell Certificate ProgramHuman Resources Management Certificate Program for Indian ExecutivesGlobal Professional Certificate in Effective Leadership and ManagementCSM® Certification TrainingCSPO® Certification TrainingLeading SAFe® 5.1 Training (SAFe® Agilist Certification)SAFe® 5.1 POPM CertificationSAFe® 5.1 Scrum Master Certification (SSM)Implementing SAFe® 5.1 with SPC CertificationSAFe® 5 Release Train Engineer (RTE) CertificationPMP® Certification TrainingPRINCE2® Foundation and Practitioner Certification
Law
Job Linked
Bootcamps
Study Abroad
Master of Business Administration (90 ECTS)Master of Business Administration (60 ECTS)Master in Computer Science (120 ECTS)Master in International Management (120 ECTS)Bachelor of Business Administration (180 ECTS)B.Sc. Computer Science (180 ECTS)MS in Data AnalyticsMS in Project ManagementMS in Information TechnologyMasters Degree in Data Analytics and VisualizationMasters Degree in Artificial IntelligenceMBS in Entrepreneurship and MarketingMSc in Data AnalyticsMS in Data AnalyticsMaster of Science in AccountancyMS in Computer ScienceMaster of Science in Business AnalyticsMaster of Business Administration MS in Data ScienceMS in Information TechnologyMaster of Business AdministrationMS in Applied Data ScienceMaster of Business AdministrationMS in Data AnalyticsM.Sc. Data Science (60 ECTS)Master of Business AdministrationMS in Information Technology and Administrative Management MS in Computer Science Master of Business Administration MBA General Management-90 ECTSMSc International Business ManagementMS Data Science MBA Business Technologies MBA Leading Business Transformation Master of Business Administration MSc Business Intelligence and Data ScienceMS Data Analytics MS in Management Information SystemsMSc International Business and ManagementMS Engineering ManagementMS in Machine Learning EngineeringMS in Engineering ManagementMSc Data EngineeringMSc Artificial Intelligence EngineeringMPS in InformaticsMPS in Applied Machine IntelligenceMS in Project ManagementMPS in AnalyticsMS in Project ManagementMS in Organizational LeadershipMPS in Analytics - NEU CanadaMBA with specializationMPS in Informatics - NEU Canada Master in Business AdministrationMS in Digital Marketing and MediaMS in Project ManagementMaster in Logistics and Supply Chain ManagementMSc Sustainable Tourism and Event ManagementMSc in Circular Economy and Sustainable InnovationMSc in Impact Finance and Fintech ManagementMS Computer ScienceMS in Applied StatisticsMS in Computer Information SystemsMBA in Technology, Innovation and EntrepreneurshipMSc Data Science with Work PlacementMSc Global Business Management with Work Placement MBA with Work PlacementMS in Robotics and Autonomous SystemsMS in Civil EngineeringMS in Internet of ThingsMSc International Logistics and Supply Chain ManagementMBA- Business InformaticsMSc International ManagementMS Computer Science with AIML ConcentrationMBA in Strategic Data Driven ManagementMaster of Business AdministrationMSc Digital MarketingMBA Business and MarketingMaster of Business AdministrationMSc Digital MarketingMSc in Sustainable Luxury and Creative IndustriesMSc in Sustainable Global Supply Chain ManagementMSc in International Corporate FinanceMSc Digital Business Analytics MSc in International HospitalityMSc Luxury and Innovation ManagementMaster of Business Administration-International Business ManagementMS in Computer EngineeringMS in Industrial and Systems EngineeringMSc International Business ManagementMaster in ManagementMSc MarketingMSc Business ManagementMSc Global Supply Chain ManagementMS in Information Systems and Technology with Business Intelligence and Analytics ConcentrationMSc Corporate FinanceMSc Data Analytics for BusinessMaster of Business AdministrationMaster of Business Administration 60 ECTSMaster of Business Administration 90 ECTSMaster of Business Administration 60 ECTSMaster of Business Administration 90 ECTSBachelors in International Management
For College Students

Bivariate Analysis on categorical variables using Excel

$$/$$

So far, you have learnt how to analyse the relationship between two continuous variables. You will now learn how to quantify and understand the relationship between pairs of categorical and continuous variables.

 

Let’s understand this from Anand.

$$/$$

The categorical bivariate analysis is essentially an extension of the segmented univariate analysis to another categorical variable. In segmented univariate analysis, you compare metrics such as ‘mean of X’ across various segments of a categorical variable, e.g., mean marks are higher for students whose mothers' education is of ‘degree and above’ level; or the median income of educated parents is higher than that of uneducated parents, etc.

 

In the categorical bivariate analysis, you extend this comparison to other categorical variables and ask — is this true for all categories of another variable, say, men and women? Take another categorical variable, such as state, and ask — is the median income of educated parents higher than that of uneducated ones in all states?

 

Thus, you are drilling down into another categorical variable and getting closer to the true patterns in data. In fact, you may also go to the next level and ask — is the median income of educated parents higher than that of uneducated ones (variable 1) in all states (variable 2) for all age groups (variable 3)? This is what you may call ‘trivariate analysis’, and though it gives you a more granular version of the truth, it gets a bit complex to make sense of and explain to others (and hence it is not usually done in EDA).

Thus, remember that performing segmented univariate analysis may deceive you into thinking that a certain phenomenon is true without asking the question — is it true for all sub-populations or is it true only when you aggregate information across the entire population?

 

So, in general, there are two fundamental aspects of analysing categorical variables:

  1. To see the distribution of two categorical variables. For example, if you want to compare the number of boys and girls who play games, you can make a ‘cross table’ as given below:

Fig-1: Cross Table
$$/$$

From this table, first, you can compare boys and girls across a fixed parameter of ‘play games’, e.g., a higher number of boys play games every day than girls, a higher number of girls never play games compared with boys, etc. And second, you can compare the parameter of ‘play games’ across a fixed value of gender, e.g., most boys play every day and very few play once a month or never.

 

  1. To see the distribution of two categorical variables with one continuous variable. For example, you saw how a student’s percentage in science is distributed based on the father’s occupation (categorical variable 1) and the income level (categorical variable 2).

 

Usually, you do not analyse more than two variables at a time, though there are ways to do that. Machine learning models are essentially a way to do that, some of which you will learn in the next course.