Logistic Regression in R: Equation Derivation [With Example]

Updated on 06 June, 2023


In this article, we’ll discuss one of the most common yet challenging concepts in machine learning, logistic regression. You’ll find what logistic regression is and the derivation of the logistic regression equation in this detailed article. 

We’ve also shared an example of logistic regression in R to make the concept easier to understand. However, make sure you are comfortable with the ideas before you work through the example. It helps to be familiar with linear regression, because the two concepts are closely linked.

What is Logistic Regression?

Logistic regression predicts a binary outcome according to a set of independent variables. It is a classification algorithm that predicts the probability of an event’s occurrence using a logit function and fitting data to it. Logistic regression is different from linear regression as it can predict the likelihood of a result that can only have two values. Using linear regression is not suitable when you have a binary variable because:

  • The linear regression would predict values outside the required range of 0 to 1
  • A single straight line cannot describe how the two outcome values are distributed across the predictor

Logistic regression doesn’t produce a line as a linear regression does. It provides an S-shaped logistic curve whose values always lie between 0 and 1.
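
A small simulation can make this contrast concrete. The sketch below uses made-up data (the variable names, seed, and the coefficient 1.5 are illustrative assumptions, not part of the article's example) and compares a straight-line fit with a logistic fit on the same 0/1 outcome.

```r
# Simulated data: a binary outcome whose probability rises with x (illustrative values)
set.seed(42)
x <- seq(-4, 4, length.out = 200)
y <- rbinom(length(x), size = 1, prob = plogis(1.5 * x))

# Straight-line fit versus logistic fit on the same 0/1 outcome
linear_fit   <- lm(y ~ x)
logistic_fit <- glm(y ~ x, family = binomial)

# Predict on a grid that is wider than the training range
grid <- data.frame(x = seq(-6, 6, by = 0.5))
linear_pred   <- predict(linear_fit, grid)
logistic_pred <- predict(logistic_fit, grid, type = "response")

range(linear_pred)    # extends below 0 and above 1
range(logistic_pred)  # stays inside the interval [0, 1]
```

The linear predictions cannot be read as probabilities at the ends of the grid, while the logistic predictions can.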


Logistic Regression Equation Derivation

We can derive the logistic regression equation from the linear regression equation. Logistic regression belongs to the class of generalized linear models (GLMs). Nelder and Wedderburn introduced this class in 1972 as a way of extending linear regression to problems it could not previously handle, and logistic regression is one special case of it.

We know that the equation of a generalized linear model is the following:

$$g(E(y)) = \alpha + \beta x_1$$

g() stands for the link function, E(y) stands for the expectation of the target variable, and the RHS (right-hand side) is the linear predictor. The link function ‘links’ the expectation of y with the linear predictor. 
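
In R, the link function is stored on the family object, which gives a quick way to see what g() does. This is a minimal sketch using base R's binomial() family; the probabilities in p are arbitrary example values.

```r
# The binomial family in base R carries its link function and the inverse link
fam <- binomial(link = "logit")

p   <- c(0.1, 0.5, 0.9)    # example expected values E(y), i.e. probabilities
eta <- fam$linkfun(p)      # g(E(y)): maps probabilities to the linear-predictor scale
eta

fam$linkinv(eta)           # inverse link: maps back to the original probabilities
```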

Suppose we have data of 100 clients, and we need to predict whether a client will buy a specific product or not. As we have a categorical outcome variable, we must use logistic regression. 

We’ll start with a linear regression equation:

$$g(y) = \beta_0 + \beta(\text{income}) \qquad (1)$$

Here, we’ve kept the independent variable as ‘income’ for ease of understanding. 

Our focus is on the probability of the resultant dependent variable (will the customer buy or not?). As we’ve already discussed, g() is our link function, and it is based on the Probability of Success (p) and Probability of Failure (1-p). p should have the following qualities:

  • p should always be positive 
  • p should always be less than or equal to 1

Now, we’ll denote g() with ‘p’ and derive our logistic regression equation. 

As a probability is never negative, we’ll put the linear equation in exponential form and get the following result:

$$p = \exp(\beta_0 + \beta(\text{income})) = e^{\beta_0 + \beta(\text{income})} \qquad (2)$$

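To make the probability less than 1, we divide this expression by a number greater than itself:

$$p = \frac{\exp(\beta_0 + \beta(\text{income}))}{\exp(\beta_0 + \beta(\text{income})) + 1} = \frac{e^{\beta_0 + \beta(\text{income})}}{e^{\beta_0 + \beta(\text{income})} + 1} \qquad (3)$$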

Using eq. (1), (2), and (3), and writing y for the linear predictor $\beta_0 + \beta(\text{income})$, we can define p as:

$$p = \frac{e^{y}}{1 + e^{y}} \qquad (4)$$

Here, p is the probability of success, so 1-p must be the probability of failure:

$$q = 1 - p = 1 - \frac{e^{y}}{1 + e^{y}} \qquad (5)$$

Let’s now divide (4) by (5):

$$\frac{p}{1 - p} = e^{y}$$
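
The division can be written out in full. Since 1 − p simplifies to a single fraction, the common factor 1 + e^y cancels:

```latex
\begin{aligned}
1 - p &= 1 - \frac{e^{y}}{1 + e^{y}} = \frac{1}{1 + e^{y}}, \\
\frac{p}{1 - p} &= \frac{e^{y}}{1 + e^{y}} \cdot \frac{1 + e^{y}}{1} = e^{y}.
\end{aligned}
```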

If we take log on both sides, we get the following:

$$\log\left(\frac{p}{1 - p}\right) = y$$

This is the link function. When we substitute the value of y we had established previously, we get:

$$\log\left(\frac{p}{1 - p}\right) = \beta_0 + \beta(\text{income})$$

And there we have it, the logistic regression equation. The left-hand side is the log-odds, which can take any real value, while the probability p it implies always remains between 0 and 1.
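
The derivation can be checked numerically. The sketch below uses made-up coefficients and income values (beta0, beta1, and income are assumptions chosen only for illustration) to confirm that taking the log-odds of p recovers the linear predictor.

```r
# Illustrative coefficients (assumed values, not estimated from any data)
beta0  <- -3
beta1  <- 0.05
income <- c(20, 60, 100)

y <- beta0 + beta1 * income        # linear predictor
p <- exp(y) / (1 + exp(y))         # equation (4): probability of success

log(p / (1 - p))                   # the log-odds of p recovers y
all.equal(log(p / (1 - p)), y)     # TRUE up to floating-point error
all(p > 0 & p < 1)                 # TRUE: every p is a valid probability
```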


Example of Logistic Regression in R

In our example of logistic regression in R, we’re using data from UCLA (University of California, Los Angeles). We have to create a model that predicts the chances of being admitted based on the data we have. The dataset has four variables: GPA, GRE score, the rank of the student’s undergraduate college, and admit.

df <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")

str(df)

## 'data.frame': 400 obs. of 4 variables:
## $ admit: int 0 1 1 1 0 1 1 0 1 0 ...
## $ gre  : int 380 660 800 640 520 760 560 400 540 700 ...
## $ gpa  : num 3.61 3.67 4 3.19 2.93 3 2.98 3.08 3.39 3.92 ...
## $ rank : int 3 3 1 4 4 2 1 2 3 2 ...

All four variables are numeric or integer. Next, we count the missing values:

sum(is.na(df))

## [1] 0

There are no missing values. There are also more rejections than admissions, because the mean of the variable admit is smaller than 0.5.

You should check that admits and rejects are reasonably represented in every category of rank. If one rank had only 5 rejects (or admits), for example, you might not want to include that rank in your analysis.

xtabs(~ admit + rank, data = df)

##      rank
## admit  1  2  3  4
##     0 28 97 93 55
##     1 33 54 28 12

Let’s run our function now:

df$rank <- as.factor(df$rank)

logit <- glm(admit ~ gre + gpa + rank, data = df, family = "binomial")

summary(logit)

##
## Call:
## glm(formula = admit ~ gre + gpa + rank, family = "binomial",
##     data = df)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -1.6268  -0.8662  -0.6388   1.1490   2.0790
##
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.989979   1.139951  -3.500 0.000465 ***
## gre          0.002264   0.001094   2.070 0.038465 *
## gpa          0.804038   0.331819   2.423 0.015388 *
## rank2       -0.675443   0.316490  -2.134 0.032829 *
## rank3       -1.340204   0.345306  -3.881 0.000104 ***
## rank4       -1.551464   0.417832  -3.713 0.000205 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 499.98  on 399  degrees of freedom
## Residual deviance: 458.52  on 394  degrees of freedom
## AIC: 470.52
##
## Number of Fisher Scoring iterations: 4

You may have noticed that we converted the rank variable from integer to factor before fitting the model. Make sure that you do the same, so that R treats rank as a category rather than as a number.


Final Result:

Suppose a student has a GPA of 3.8 and a GRE score of 790, and studied at a rank-1 college. Let’s use our model to estimate the student’s chances of being admitted:

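x <- data.frame(gre = 790, gpa = 3.8, rank = as.factor(1))

# predict() on a glm returns the log-odds (link scale) by default
p <- predict(logit, x)

p

## 1
## 0.85426

# plogis() applies the inverse logit to convert log-odds to a probability
plogis(p)

By default, predict() on a glm object returns the linear predictor, so 0.85426 is the log-odds of admission rather than a probability. Applying the inverse logit, 1 / (1 + exp(-0.85426)), gives about 0.70; calling predict(logit, x, type = "response") returns the probability directly. Our model therefore predicts that the student has roughly a 70% chance of being admitted.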
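
As a cross-check, the prediction can be reproduced by hand from the coefficient table in the summary output. The sketch below copies the rounded estimates for the intercept, gre, and gpa (rank 1 is the reference level, so it contributes nothing), computes the log-odds, and converts them to a probability with plogis(), base R's inverse logit.

```r
# Coefficients copied from the summary(logit) output above (rounded as printed)
b <- c(intercept = -3.989979, gre = 0.002264, gpa = 0.804038)

# Rank 1 is the reference level, so no rank coefficient is added
log_odds <- b[["intercept"]] + b[["gre"]] * 790 + b[["gpa"]] * 3.8
log_odds            # about 0.854, matching the predict() output up to rounding

# Inverse logit converts the log-odds to a probability
plogis(log_odds)    # about 0.70
```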


Final Thoughts

That’s it for this article. We’re confident that you’d have found it quite helpful. If you have any questions or thoughts on logistic regression and its related topics, please share them in the comment section below. 


Regularization Techniques in Logistic Regression

L1 and L2 Regularization:

  • L1 (lasso) and L2 (ridge) regularization are widely used with logistic regression.
  • These techniques address issues like overfitting and improve the model’s generalization capabilities.
  • Implement regularization in R using packages like glmnet or caret.
  • A penalty term on the size of the coefficients is added to the model’s loss function, which shrinks the impact of each independent variable and can improve the model’s predictive performance on new data.
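
To show what the penalty does, here is a minimal hand-rolled sketch of L2-penalized logistic regression using base R's optim(). It does not use glmnet or caret; the simulated data, the function names penalized_nll and fit_ridge, and the lambda values are all assumptions made for illustration.

```r
# Simulated data (illustrative): two informative predictors and one pure-noise predictor
set.seed(1)
n <- 200
X <- cbind(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
y <- rbinom(n, size = 1, prob = plogis(0.5 + 1.2 * X[, "x1"] - 0.8 * X[, "x2"]))

# Negative log-likelihood plus an L2 penalty on the slopes (intercept left unpenalized)
penalized_nll <- function(beta, X, y, lambda) {
  eta <- as.vector(beta[1] + X %*% beta[-1])
  nll <- -sum(y * eta - log1p(exp(eta)))
  nll + lambda * sum(beta[-1]^2)
}

# Minimize the penalized loss for a given lambda, starting from all zeros
fit_ridge <- function(lambda) {
  start <- rep(0, ncol(X) + 1)
  optim(start, penalized_nll, X = X, y = y, lambda = lambda, method = "BFGS")$par
}

unpenalized <- fit_ridge(0)    # lambda = 0 is ordinary maximum likelihood
penalized   <- fit_ridge(10)   # a larger lambda shrinks the slopes toward zero

round(rbind(unpenalized, penalized), 3)
```

In practice the penalty strength would be chosen by cross-validation rather than fixed by hand.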

Model Evaluation and Performance Metrics: To assess the accuracy and reliability of a logistic regression model, it’s essential to evaluate its performance. In R, you can utilize various performance metrics like accuracy, precision, recall, F1-score, and area under the ROC curve (AUC-ROC). These metrics provide insights into the model’s ability to correctly classify instances and quantify the trade-offs between true positives and false positives. Techniques such as cross-validation and train-test splits help assess the model’s robustness and prevent overfitting.
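
The metrics above can be computed with base R alone. The sketch below is one way to do it on simulated data with a 70/30 train-test split; the data, the 0.5 threshold, and all variable names are illustrative assumptions.

```r
# Simulated data (illustrative) with a train/test split
set.seed(123)
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.5 * x))
dat <- data.frame(x = x, y = y)

train_idx <- sample(n, size = round(0.7 * n))
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

fit  <- glm(y ~ x, data = train, family = binomial)
prob <- predict(fit, newdata = test, type = "response")
pred <- as.integer(prob > 0.5)   # 0.5 is an arbitrary classification threshold

# Confusion-matrix counts
tp <- sum(pred == 1 & test$y == 1)
fp <- sum(pred == 1 & test$y == 0)
fn <- sum(pred == 0 & test$y == 1)
tn <- sum(pred == 0 & test$y == 0)

accuracy  <- (tp + tn) / length(pred)
precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
f1        <- 2 * precision * recall / (precision + recall)

# AUC via the rank-sum (Mann-Whitney) formulation
n_pos <- sum(test$y == 1)
n_neg <- sum(test$y == 0)
auc <- (sum(rank(prob)[test$y == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

round(c(accuracy = accuracy, precision = precision, recall = recall, f1 = f1, auc = auc), 3)
```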

Dealing with Imbalanced Data

Imbalanced datasets, where one class has significantly fewer instances than the other, are prevalent in real-world scenarios. Handling imbalanced data is crucial to prevent biased predictions and ensure reliable model performance. In logistic regression, where the goal is to accurately classify instances into binary outcomes, imbalanced data can pose challenges and lead to skewed results.

R provides several effective techniques to address the issue of imbalanced datasets and improve the model’s performance in such scenarios. These techniques include oversampling, undersampling, and the synthetic minority oversampling technique (SMOTE). Let’s explore each technique in more detail:

  1. Oversampling: Oversampling involves increasing the number of instances in the minority class to match the majority class. This technique aims to balance the dataset by creating synthetic or replicated samples of the minority class. By increasing the representation of the minority class, oversampling helps the model capture the patterns and characteristics of both classes more accurately.
  2. Undersampling: Undersampling, on the other hand, involves reducing the number of instances in the majority class to achieve a balanced dataset. This technique randomly removes instances from the majority class, eliminating the class imbalance. Undersampling can be a useful approach when the majority class has a large number of redundant or similar instances, and reducing their quantity does not significantly impact the overall information contained in the dataset.
  3. Synthetic Minority Oversampling Technique (SMOTE): SMOTE is a popular technique that generates synthetic instances of the minority class to balance the dataset. Instead of simply replicating instances, SMOTE creates synthetic samples by interpolating between existing instances of the minority class. By introducing synthetic examples, SMOTE diversifies the dataset and helps the model learn more robust decision boundaries.

By applying these techniques, you can address the class imbalance issue in logistic regression. This balancing act allows the model to train on a representative dataset and make accurate predictions for both classes. It is important to note that the choice of oversampling, undersampling, or SMOTE depends on the specific characteristics of the dataset and the problem at hand. Experimentation and evaluation of different techniques are crucial to find the most effective approach.
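
Random oversampling and undersampling need nothing beyond base R's sample(). The sketch below shows both on simulated data in which positives are rare; the data and variable names are illustrative assumptions. SMOTE is not shown here, since it requires interpolating between neighbouring minority rows and is usually taken from a package.

```r
# Simulated imbalanced data (illustrative): positives are the rare class
set.seed(7)
n <- 1000
dat <- data.frame(x = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(-2.5 + dat$x))

minority <- dat[dat$y == 1, ]
majority <- dat[dat$y == 0, ]

# Random oversampling: resample minority rows with replacement up to the majority size
over_idx    <- sample(nrow(minority), size = nrow(majority), replace = TRUE)
oversampled <- rbind(majority, minority[over_idx, ])

# Random undersampling: keep a random subset of majority rows matching the minority size
under_idx    <- sample(nrow(majority), size = nrow(minority))
undersampled <- rbind(majority[under_idx, ], minority)

table(dat$y)           # original class counts
table(oversampled$y)   # balanced by replicating minority rows
table(undersampled$y)  # balanced by dropping majority rows
```

Resampling should be applied to the training data only, so that the test set still reflects the real class balance.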

Multicollinearity and Feature Selection

Multicollinearity:

  • Multicollinearity refers to a high correlation among independent variables in logistic regression.
  • It leads to unstable coefficient estimates and decreased interpretability.
  • Detect and mitigate multicollinearity in R using methods like variance inflation factor (VIF) analysis and correlation matrices.
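
The VIF of a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on all the others. The sketch below computes it by hand in base R on simulated predictors, one of which is deliberately built to track another; the data and names are illustrative assumptions.

```r
# Simulated predictors (illustrative): x2 is deliberately built to track x1 closely
set.seed(99)
n <- 300
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.2)   # highly correlated with x1
x3 <- rnorm(n)                  # unrelated to the others
predictors <- data.frame(x1, x2, x3)

round(cor(predictors), 2)       # correlation matrix

# VIF for each predictor: 1 / (1 - R^2) from regressing it on the remaining predictors
vif <- sapply(names(predictors), function(v) {
  others <- setdiff(names(predictors), v)
  r2 <- summary(lm(reformulate(others, response = v), data = predictors))$r.squared
  1 / (1 - r2)
})

round(vif, 2)   # x1 and x2 come out large, x3 close to 1
```

A common rule of thumb treats values above roughly 5 to 10 as a sign of problematic collinearity.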

Feature Selection Techniques:

  • Feature selection helps identify the most influential variables and improves model efficiency.
  • R offers various feature selection techniques such as stepwise regression, lasso regression, and recursive feature elimination.
  • These techniques assist in selecting a subset of relevant features and improving model interpretability and performance.
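
Of the techniques listed, stepwise selection is available in base R through step(). The sketch below runs backward selection by AIC on simulated data in which only two of four predictors matter; the data and names are illustrative assumptions, and lasso and recursive feature elimination are not shown.

```r
# Simulated data (illustrative): only x1 and x2 influence the outcome
set.seed(2024)
n <- 400
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(1.5 * dat$x1 - 1.5 * dat$x2))

full <- glm(y ~ x1 + x2 + x3 + x4, data = dat, family = binomial)

# Backward stepwise selection by AIC; trace = 0 suppresses the step-by-step log
selected <- step(full, direction = "backward", trace = 0)

formula(selected)   # the retained predictors
AIC(full)
AIC(selected)       # never higher than the full model's AIC
```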

Interpreting Logistic Regression Coefficients: Understanding the impact of each independent variable on the outcome is a fundamental aspect of logistic regression. The estimated coefficients offer insights into the direction and magnitude of the relationships between predictors and the log odds of the binary outcome. By exponentiating the coefficients, they can be interpreted as odds ratios, indicating how the odds of the outcome change with a unit increase in the predictor. R’s summary output of the logistic regression model provides these coefficients along with their standard errors, z-values, and p-values.
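
Applied to the admissions model from the example, the conversion looks like this. The sketch copies the rounded estimates from the summary(logit) output shown earlier in the article; with a fitted model in memory, exp(coef(logit)) would give the same quantities.

```r
# Coefficient estimates copied from the summary(logit) output earlier in the article
coefs <- c(gre = 0.002264, gpa = 0.804038,
           rank2 = -0.675443, rank3 = -1.340204, rank4 = -1.551464)

# Exponentiating a log-odds coefficient gives an odds ratio
odds_ratios <- exp(coefs)
round(odds_ratios, 3)
```

For instance, the gpa value of about 2.23 says that a one-point increase in GPA multiplies the odds of admission by roughly 2.2, holding the other predictors fixed. The rank values are below 1, meaning lower odds of admission relative to rank-1 colleges.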

Frequently Asked Questions (FAQs)

1. How are logistic regression and linear regression different from each other?

Linear regression's goal is to identify the best-fitting line, but logistic regression goes one step further and passes the linear predictor through the sigmoid curve. The main difference between these two methods is that logistic regression is applied when the dependent variable is binary. When the dependent variable is continuous and the regression line is linear, linear regression is used. While the ordinary least squares technique is used to estimate linear regression, the maximum likelihood estimation method is used to estimate logistic regression. The output of linear regression is continuous, whereas logistic regression outputs a probability that is mapped to a limited number of discrete classes.

2. When is the use of logistic regression helpful?

Logistic regression is used to predict a categorical dependent variable. The predictors themselves can be continuous or categorical. Logistic regression analysis is useful for estimating the likelihood of an occurrence, and it assists in determining the probability of each of two classes. Only classification and probability outcomes may be predicted using logistic regression. It may be used to solve a variety of classification problems like spam detection, diabetes prediction, cancer diagnosis, and so on.

3. What are the limitations of using logistic regression?

1. Since logistic regression has a linear decision surface, it cannot capture non-linear relationships unless transformed or interaction terms are added.
2. The logistic regression algorithm is sensitive to outliers.
3. Scaling and normalization of the features are often needed, particularly when regularization is used, so data preparation can be time consuming.
4. If a feature exists that completely separates the two classes, the maximum likelihood estimates do not converge and the coefficients grow without bound. This is termed 'complete separation.'
5. If the number of observations is fewer than the number of features, logistic regression should not be applied as it may result in overfitting.
6. Another disadvantage is that each data point in logistic regression needs to be independent of all other data points. When observations are connected, the model tends to overestimate the relevance of those observations.
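
Complete separation (point 4) is easy to reproduce. The sketch below uses a tiny made-up dataset in which x splits the two classes perfectly, and collects the warnings that glm() raises rather than printing them; the data and the name warn_msgs are illustrative assumptions.

```r
# Tiny made-up dataset in which x separates the two classes perfectly
x <- c(1, 2, 3, 4, 5, 6)
y <- c(0, 0, 0, 1, 1, 1)

# Collect any warnings raised during fitting instead of printing them
warn_msgs <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, family = binomial),
  warning = function(w) {
    warn_msgs <<- c(warn_msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

warn_msgs   # glm() warns about the fit
coef(fit)   # the slope is very large, because the likelihood keeps improving as it grows
```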