Linear Regression Explained with Example

Updated on 23 September, 2022


Linear regression is one of the most common algorithms for establishing relationships between the variables of a dataset. Such a mathematical model is an essential tool for data scientists performing predictive analysis. This blog covers the fundamental concept and walks through a linear regression example.

What are Regression Models?

A regression model describes the relationship between dataset variables by fitting a line to the data observed. It is a mathematical analysis that sorts out which variables have an impact and matter the most. It also determines how certain we are about the factors involved. The two kinds of variables are:

  • Dependent: Factor that you are attempting to predict or understand. 
  • Independent: Factors that you suspect to have an impact on the dependent variable.

Regression models are typically used when the dependent variable is quantitative. Logistic regression is the exception, where the dependent variable is binary. In this blog, we will mainly focus on the linear regression model, where both variables are quantitative.

Suppose you have data on the monthly sales and average monthly rainfall for the past three years. Let’s say that you plotted this information on a chart. The y-axis represents the number of sales (dependent variable), and the x-axis depicts the rainfall (independent variable). Each dot on the chart would show how much it rained during a particular month and the corresponding sales numbers.

If you take another glance at the data, you might notice a pattern. Suppose sales tend to be higher in the months when it rained more. It would still be tricky to estimate how much you would typically sell when it rained a certain amount, say 3 or 4 inches. You could get some degree of certainty if you drew a line through the middle of all data points on the chart.

Nowadays, Excel and statistics software like SPSS, R, or STATA can help you draw a line that best fits the data at hand. In addition, you can also output a formula explaining the slope of the line. 

Consider this formula for the above example: Y = 200 + 3X. It tells you that you would expect to sell 200 units when it doesn’t rain at all (i.e., when X = 0). Assuming that the relationship stays the same going forward, every additional inch of rain would result in three more units sold on average. You would sell 203 units if it rains 1 inch, 206 units if it rains 2 inches, 209 units if it rains 3 inches, and so on.
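
The arithmetic of this formula can be sketched in a few lines of Python. The intercept (200) and slope (3) come from the example; the function and constant names are illustrative assumptions.

```python
# Illustrative sketch of the example formula Y = 200 + 3X.
# The names are hypothetical; only the two numbers come from the example.
INTERCEPT = 200  # predicted units sold when rainfall is 0 inches
SLOPE = 3        # additional units sold per extra inch of rain


def predict_sales(rain_inches):
    """Return the predicted number of units sold for a given rainfall."""
    return INTERCEPT + SLOPE * rain_inches


# Print the prediction for 0 to 3 inches of rain.
for inches in range(4):
    print(inches, "inch(es) of rain ->", predict_sales(inches), "units")
```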

Typically, the regression line formula also includes an error term (Y = 200 + 3X + error term). It accounts for the reality that independent variables are rarely perfect predictors of the dependent variable, so the line merely gives you an estimate based on the data available. The larger the error term, the less certain your regression line is.

Linear Regression Basics

A simple linear regression model uses a straight line to estimate the relationship between two quantitative variables. If you have more than one independent variable, you will use multiple linear regression instead.

Simple linear regression analysis is concerned with two things. First, it tells you the strength of the relationship between the dependent and independent factors in the historical data. Second, it gives you an estimated value of the dependent variable at a given value of the independent variable.

Consider this linear regression example. A social researcher interested in knowing how individuals’ income affects their happiness levels performs a simple regression analysis to see if a linear relationship occurs. The researcher takes quantitative values of the dependent variable (happiness) and independent variable (income) by surveying people in a particular geographical location. 

For instance, the data contains income figures and happiness levels (ranked on a scale from 1 to 10) from 500 people from the Indian state of Maharashtra. The researcher would then plot the data points and fit a regression line to know how much the respondents’ earnings influence their wellbeing. 

Linear regression analysis is based on a few assumptions about the data. These are:

  • Linearity of the relationship between the dependent and independent variable (i.e., the line of best fit is straight, not curved).
  • Homogeneity of variance, meaning the size of the error in the prediction does not change significantly across different values of the independent variable.
  • Independence of observations in the dataset, meaning there are no hidden relationships among them.
  • Normality of data distribution for the dependent variable. You can check this with the hist() function in R.
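
As a rough illustration of the homogeneity-of-variance assumption above, the following Python sketch compares the spread of the prediction errors for small and large values of the independent variable. It is a crude check rather than a formal test. The line y = 1 + 2x, the data points, and the helper names are all made up for illustration.

```python
from statistics import pstdev


def residuals(xs, ys, c, a):
    """Observed minus predicted y for the line y = c + a*x."""
    return [y - (c + a * x) for x, y in zip(xs, ys)]


def spread_by_half(xs, res):
    """Standard deviation of the residuals for the lower and upper half of x."""
    ordered = [r for _, r in sorted(zip(xs, res))]
    mid = len(ordered) // 2
    return pstdev(ordered[:mid]), pstdev(ordered[mid:])


# Made-up points scattered around the assumed line y = 1 + 2x.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3.5, 4.5, 7.5, 8.5, 11.5, 12.5, 15.5, 16.5]

res = residuals(xs, ys, c=1, a=2)
low, high = spread_by_half(xs, res)
# Similar values suggest the error size does not depend on x.
print("residual spread, low x:", low, "high x:", high)
```

If the two numbers differed widely, the error would be growing or shrinking with x, which would call the assumption into question.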

The Math Behind Linear Regression

y = c + ax is a standard equation where y is the output (that we want to estimate), x is the input variable (that we know), a is the slope of the line, and c is the constant. 

Here, the output varies linearly with the input. The slope determines how much a change in x changes the value of y. The constant is the value of y when x is zero.
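
One common way to obtain a and c from data is ordinary least squares. The article does not name a fitting method, so treat that choice as an assumption. Under it, the slope is a = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and the constant is c = ȳ − a·x̄. The Python sketch below applies these formulas to made-up points; the function name fit_line is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = c + a*x; returns (c, a)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    # Constant: the fitted line passes through the point of means.
    c = mean_y - a * mean_x
    return c, a


# Made-up points that lie exactly on y = 2 + 0.5x.
xs = [0, 2, 4, 6]
ys = [2, 3, 4, 5]
c, a = fit_line(xs, ys)
print("c =", c, "a =", a)
```

Because the sample points sit exactly on a line, the fit recovers that line's constant and slope. With real, noisy data the result would be the line that minimises the squared prediction errors.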

Let’s understand this through another linear regression example. Imagine that you are employed in an automobile company and want to study India’s passenger vehicle market. Let’s say that the national GDP influences passenger vehicle sales. To plan better for the business, you might want to find the linear equation that relates the number of vehicles sold in the country to the GDP.

For this, you would need sample data for year-wise passenger vehicle sales and the GDP figures for every year. You might discover that the GDP of the current year affects the sales for the next year: whenever GDP was lower in a given year, vehicle sales were lower in the subsequent year.

To prepare this data for Machine Learning analytics, you would need to do a little more work. 

  • Start with the equation y = c + ax, where y is the number of vehicles sold in a year and x is the GDP of the prior year.
  • To find c and a in the above problem, you can create a model using Python.
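
A minimal Python sketch of these two steps follows, assuming ordinary least squares as the fitting method. The figures are toy numbers in arbitrary units, invented for illustration and not real GDP or sales data. The helper names are hypothetical.

```python
def lag_pairs(gdp, sales):
    """Pair each year's sales (y) with the GDP of the prior year (x)."""
    return gdp[:-1], sales[1:]


def fit_line(xs, ys):
    """Ordinary least squares for y = c + a*x; returns (c, a)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    return mean_y - a * mean_x, a


# Toy figures for years 1 to 5, invented for illustration (not real data).
gdp = [100, 110, 120, 130, 140]
sales = [48, 52, 57, 62, 67]

x, y = lag_pairs(gdp, sales)
c, a = fit_line(x, y)
print("c =", c, "a =", a)

# Estimate year 6 sales from the year 5 GDP figure.
print("year 6 estimate:", c + a * gdp[-1])
```

The first year's sales are dropped because there is no prior-year GDP to pair them with, and the last GDP figure is used only for the forward estimate.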

If you were to perform simple linear regression in R, interpreting and reporting results become much easier.

For the same linear regression example, let us change the equation to y = B0 + B1x + e. Again, y is the dependent variable, and x is the independent or known variable. B0 is the constant or intercept, B1 is the slope (the regression coefficient), and e is the error of the estimate.

Statistical software like R can find the line of best fit through the data by searching for the B0 and B1 that minimise the total error of the model.

Follow these steps to begin:

  • Load the passenger vehicle sales dataset into the R environment.
  • Run the command to generate a linear model that describes the relationship between passenger vehicle sales and GDP. 
    • sales.gdp.lm <- lm(sales ~ gdp, data = sales.data)
  • Use the summary() function to view the most important linear model parameters in tabulated form.
    • summary(sales.gdp.lm)

    Note: The output contains sections such as Call, Residuals, and Coefficients. The ‘Call’ section states the formula used. The ‘Residuals’ section gives the minimum, quartiles, median, and maximum of the residuals to indicate how well the model fits the real data. The first row of the ‘Coefficients’ table estimates the y-intercept, and the second row gives the regression coefficient. The columns of this table are labelled Estimate, Std. Error, t value, and Pr(>|t|) (the p-value).

  • Plug the (Intercept) estimate and the regression coefficient into the regression equation to predict sales values across the range of GDP numbers.
  • Check the Estimate column to see the size of the effect. The regression coefficient tells you how much sales change for each one-unit change in GDP.
  • Read the Std. Error column to see how much variation there is in your estimate of the relationship between sales and GDP.
  • Look at the test statistic under t value to judge whether the result could have occurred by chance. The larger the absolute t value, the less likely that is.
  • Go through the Pr(>|t|) column for the p-values, which show how likely it would be to see an effect at least this large if the null hypothesis (no relationship between GDP and sales) were true.
  • Present your results with the estimated effect, standard error, and p-values, clearly communicating what the regression coefficient means.
  • Include a graph with the report. A simple linear regression can be shown as a scatter plot with the regression line and function.
  • Calculate the error by measuring the distance between the observed and predicted y values, squaring the distance at each value of x, and taking the mean of the squares.
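
The quantities in the steps above can also be worked out by hand. The following Python sketch is not R's implementation. It assumes the standard ordinary least squares formulas, and the function name summarise_fit and the five data points are made up for illustration. It leaves out the p-value, which requires the t distribution.

```python
import math


def summarise_fit(xs, ys):
    """Estimate, standard error and t value for B0 and B1, plus the MSE."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    b1 = sxy / sxx               # slope (regression coefficient)
    b0 = mean_y - b1 * mean_x    # intercept

    # Squared distances between observed and predicted y values.
    sq_err = [(y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys)]
    sse = sum(sq_err)
    mse = sse / n                # mean of the squared distances
    resid_var = sse / (n - 2)    # residual variance, n - 2 degrees of freedom

    se_b1 = math.sqrt(resid_var / sxx)
    se_b0 = math.sqrt(resid_var * (1 / n + mean_x ** 2 / sxx))
    return {
        "intercept": (b0, se_b0, b0 / se_b0),
        "slope": (b1, se_b1, b1 / se_b1),
        "mse": mse,
    }


# Made-up data points for illustration only.
result = summarise_fit([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name in ("intercept", "slope"):
    estimate, std_error, t_value = result[name]
    print(name, "estimate:", estimate, "std error:", std_error, "t value:", t_value)
print("mean squared error:", result["mse"])
```

Each tuple mirrors one row of the Coefficients table: the estimate, its standard error, and their ratio, the t value.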

Conclusion

With the above linear regression example, we have given you an overview of generating a simple linear regression model, finding the regression coefficient, and calculating the error of the estimate. We also touched upon the relevance of Python and R for predictive data analytics and statistics. Practical knowledge of such tools is crucial for pursuing careers in data science and machine learning today.

If you want to hone your programming skills, check out the Advanced Certificate Programme in Machine Learning by IIT Madras and upGrad. The online course also includes case studies, projects, and expert mentorship sessions to bring industry-orientedness to the training process.