Data Cleaning and Preparation - II in Logistic Regression

You’ve merged your dataframes and handled the categorical variables present in them. But you still need to check the data for any outliers or missing values and treat them accordingly. Let's get this done as well.

You saw that one of the columns, 'TotalCharges', had 11 missing values. Since this is a small number compared to the number of rows in the dataset, we dropped these rows; very little data is lost by doing so.
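As a minimal sketch of this step, the snippet below counts the missing values in 'TotalCharges' and drops the affected rows with pandas. The dataframe name `telecom` and its values are assumptions made up for illustration; they are not the course data, and the toy frame is far smaller than the real one.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the real telecom data has many more rows and columns.
telecom = pd.DataFrame({
    "tenure": [1, 34, 2, 45, 72],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30, 118.80],
    "TotalCharges": [29.85, 1889.50, np.nan, 1840.75, np.nan],
})

# Count missing values per column to decide how to treat them.
missing_counts = telecom.isnull().sum()

# Keep only the rows where 'TotalCharges' is present.
telecom_clean = telecom.dropna(subset=["TotalCharges"])

print(missing_counts["TotalCharges"], len(telecom_clean))
```

In this toy frame the missing share is large only because the example is tiny; dropping rows is reasonable when, as in the lesson, the affected rows are a small fraction of the data.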

Now that you have completely prepared your data, you can start with the preprocessing steps. As you might remember from the previous module, you first need to split the data into train and test sets and then rescale the features. So let’s start with that.
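The order of these two steps can be sketched in plain Python. One common reason for splitting first is that the scaling parameters (mean and standard deviation) are then learned from the training set alone and reused on the test set. The tenure values, the 70/30 ratio and the seed below are assumptions chosen for illustration, not values taken from the lesson.

```python
import random
import statistics

# Hypothetical tenure values (in months) standing in for one continuous feature.
tenure = [1, 5, 12, 18, 24, 30, 41, 50, 63, 72]

# Shuffle with a fixed seed so the split is reproducible, then cut 70/30.
rng = random.Random(100)
shuffled = tenure[:]
rng.shuffle(shuffled)
cut = int(0.7 * len(shuffled))
train, test = shuffled[:cut], shuffled[cut:]

# Learn the scaling parameters from the training set only.
mu = statistics.mean(train)
sigma = statistics.pstdev(train)  # population standard deviation

# Apply the same parameters to both sets.
train_scaled = [(x - mu) / sigma for x in train]
test_scaled = [(x - mu) / sigma for x in test]
```

After this, the scaled training values have mean 0 and standard deviation 1, while the test values are expressed on the same scale without having influenced it.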

Recall that Rahim standardised the three continuous variables: tenure, monthly charges and total charges. Scaling brings the values in a column onto a common scale; in this case, each value is converted to its Z-score, i.e. the number of standard deviations by which it differs from the column mean.

For example, let’s say that, for a particular customer, tenure = 72. After standardising, the value of scaled tenure becomes:

$\frac{72 - 32.4}{24.6} = 1.61$

because, for the variable tenure, the mean ($\mu$) is 32.4 and the standard deviation ($\sigma$) is 24.6.

The variables had these ranges before standardisation:

Tenure = 1 to 72
Monthly charges = 18.25 to 118.80
Total charges = 18.8 to 8685

After standardisation, the ranges of the variables changed to:

Tenure = -1.28 to +1.61
Monthly charges = -1.55 to +1.79
Total charges = -0.99 to +2.83
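The tenure range above can be checked with a short sketch that applies the Z-score formula to the two endpoints, using the mean and standard deviation quoted earlier. The function name `standardise` is a hypothetical helper introduced here for illustration.

```python
def standardise(value, mean, std):
    """Return the Z-score of value for a variable with the given mean and std."""
    return (value - mean) / std

# Mean and standard deviation of tenure, as quoted in the text.
TENURE_MEAN = 32.4
TENURE_STD = 24.6

low = standardise(1, TENURE_MEAN, TENURE_STD)    # minimum tenure
high = standardise(72, TENURE_MEAN, TENURE_STD)  # maximum tenure

print(round(low, 2), round(high, 2))  # -1.28 1.61
```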

Since all three variables are now on a comparable scale, none of them will have a disproportionate effect on the model’s results merely because of the units it is measured in.

Churn Rate and Class Imbalance

Another thing to note here is the churn rate, which Rahim talked about at the end of the video. You saw that the data has a churn rate of almost 27%. Checking the churn rate is important because you usually want your data to have a balance between the 0s and 1s (in this case, not-churn and churn).

The reason for having a balance is simple. Consider a thought experiment: if you had a dataset with, say, 95% not-churn (0) and just 5% churn (1), then even if you predicted everything as 0, you would still get a model that is 95% accurate (though it is, of course, a bad model). This problem is called class imbalance, and you'll learn to handle such cases later.
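The thought experiment can be reproduced in a few lines. The labels below are made up to match the 95/5 split described above, and the "model" simply predicts 0 for everyone. Besides accuracy, the sketch also computes the fraction of actual churners that were caught, a measure not discussed in this section but one that shows why such a model is bad.

```python
# Hypothetical labels: 95 non-churners (0) and 5 churners (1).
y_true = [0] * 95 + [1] * 5

# A "model" that predicts not-churn for every customer.
y_pred = [0] * len(y_true)

# Accuracy: share of predictions that match the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Fraction of actual churners that the model identified.
churners = sum(y_true)
caught = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
churn_caught = caught / churners

print(accuracy, churn_caught)  # 0.95 0.0
```

The accuracy looks high, yet not a single churner is identified, which is exactly the case the model is supposed to detect.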

Fortunately, in this case, we have a churn rate of about 27%. This is neither exactly 'balanced' (which a 50-50 ratio would be called) nor heavily imbalanced, so we do not need any special treatment for this dataset.
