
Polynomial Regression: Importance, Step-by-Step Implementation

Updated on 26 September, 2022


Introduction

In the vast field of Machine Learning, what is the first algorithm that most of us study? For many, it is Linear Regression. Often the first algorithm and program one writes when starting out in Machine Learning, Linear Regression is simple and powerful on data that follows a linear trend.

What if the dataset we come across does not follow a linear trend? What if a linear regression model cannot capture the relationship between the independent and dependent variables?

This is where another type of regression, known as Polynomial Regression, comes in. True to its name, Polynomial Regression is a regression algorithm that models the relationship between the dependent variable (y) and the independent variable (x) as an nth degree polynomial. In this article, we shall understand the algorithm and math behind Polynomial Regression along with its implementation in Python.

What is Polynomial Regression?

As defined earlier, Polynomial Regression is a special case of linear regression in which a polynomial equation of a specified degree n is fitted to non-linear data, capturing a curvilinear relationship between the dependent and independent variables.

$$y = b_0 + b_1 x_1 + b_2 x_1^2 + b_3 x_1^3 + \dots + b_n x_1^n$$

Here,

$y$ is the dependent variable (output variable)

$x_1$ is the independent variable (predictor)

$b_0$ is the bias

$b_1, b_2, \dots, b_n$ are the weights in the regression equation.

As the degree n becomes higher, the polynomial becomes more flexible and more complicated, and the model becomes more likely to overfit, which is discussed later in this article.
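The equation above is linear in the weights even though it is non-linear in x1, so the weights can be found with ordinary least squares. The following is a minimal sketch of that idea using NumPy only; the data values and the chosen cubic are made up for illustration and are not taken from the article's dataset.

```python
import numpy as np

# Illustrative data generated from a known cubic (values are made up).
x = np.linspace(-2.0, 2.0, 9)
y = 1.0 + 2.0 * x - 0.5 * x**2 + 0.25 * x**3

# Design matrix with columns [1, x, x^2, x^3]: one column per power of x.
degree = 3
A = np.vander(x, degree + 1, increasing=True)

# Ordinary least squares: the model is linear in b0..b3 even though it is cubic in x.
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(coeffs, 3))  # estimates of b0, b1, b2, b3
```

Because the data here is noise-free, the estimated weights should match the ones used to generate it up to floating-point error.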

Comparison of Regression Equations
 

Simple Linear Regression ===> $y = b_0 + b_1 x$

Multiple Linear Regression ===> $y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_n x_n$

Polynomial Regression ===> $y = b_0 + b_1 x_1 + b_2 x_1^2 + b_3 x_1^3 + \dots + b_n x_1^n$

From the above three equations, we see several subtle differences. The Simple and Multiple Linear Regression equations differ from the Polynomial Regression equation in that every variable in them appears with a degree of only 1. The Multiple Linear Regression equation contains several variables x1, x2, and so on. The Polynomial Regression equation has only one variable x1, but it is raised to powers up to n, which differentiates it from the other two. All three equations remain linear in the coefficients b0 to bn, which is why Polynomial Regression is still treated as a special case of linear regression.
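One way to see how the polynomial form relates to the multiple linear form is to treat each power of x1 as its own column. The sketch below uses scikit-learn's PolynomialFeatures on two made-up input values to show the expansion for degree 3; the variable names are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Two made-up values of the single predictor x1, as a column vector.
x1 = np.array([[2.0], [3.0]])

# Expand each value into the columns [1, x1, x1^2, x1^3].
expanded = PolynomialFeatures(degree=3).fit_transform(x1)
print(expanded)  # rows: [1, 2, 4, 8] and [1, 3, 9, 27]
```

A linear model fitted on these four columns is a multiple linear regression whose "variables" happen to be powers of the same input.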

Need for Polynomial Regression

In the first of the diagrams below, a straight line is fitted to a set of non-linear data points. A straight line cannot follow the curvature of such data, so even after training the value of the loss function stays high and the model's error is large.

On the other hand, when we apply Polynomial Regression, the curve clearly fits the data points well. This indicates that the polynomial equation captures the relationship between the variables in the dataset. Thus, for cases where the data points follow a non-linear pattern, we require a Polynomial Regression model.
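The difference between the two fits can also be checked numerically. The following sketch generates synthetic curved data (made up for illustration, not the article's dataset) and compares the R^2 score of a straight-line fit with that of a degree 2 polynomial fit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic curved data (made up for illustration): y grows quadratically with x.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50).reshape(-1, 1)
y = 3.0 + 0.5 * x.ravel() ** 2 + rng.normal(scale=1.0, size=50)

# Straight-line fit.
linear = LinearRegression().fit(x, y)
r2_linear = linear.score(x, y)

# Degree 2 polynomial fit: expand x into [1, x, x^2], then fit a linear model.
x_poly = PolynomialFeatures(degree=2).fit_transform(x)
poly = LinearRegression().fit(x_poly, y)
r2_poly = poly.score(x_poly, y)

print(f"R^2 linear: {r2_linear:.3f}, R^2 degree 2: {r2_poly:.3f}")
```

On data like this, the polynomial fit should score noticeably higher, because the straight line has no way to represent the curvature.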

Implementation of Polynomial Regression in Python

From here, we shall build a Machine Learning model in Python implementing Polynomial Regression. We shall compare the results obtained with Linear Regression and Polynomial Regression. Let us first understand the problem that we are going to solve with Polynomial Regression.

Problem Description

Consider the case of a start-up looking to hire several candidates from another company. There are openings for different job roles, and the start-up has details of the salary for each role in the previous company. When a candidate mentions his or her previous salary, the HR team of the start-up needs to verify it against this existing data. The dataset describes each role with two columns, Position and Level; the code in Step 1 selects only the numeric Level column as the independent variable. The dependent variable (output) is the Salary, which is to be predicted using Polynomial Regression.

On visualizing the above table in a graph, we see that the data is non-linear in nature. In other words, as the level increases the salary increases at a higher rate thus giving us a curve as shown below.

Step 1: Data Pre-Processing

The first step in building any Machine Learning model is to import the libraries. Here, only three basic libraries are needed: NumPy, Matplotlib and pandas. After this, the dataset is loaded from my GitHub repository and the independent and dependent variables are assigned. The independent variable is stored in X and the dependent variable is stored in y.

import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

dataset = pd.read_csv('https://raw.githubusercontent.com/mk-gurucharan/Regression/master/PositionSalaries_Data.csv')

X = dataset.iloc[:, 1:-1].values

y = dataset.iloc[:, -1].values

Here, in the term [:, 1:-1], the first colon means that all rows are taken, and 1:-1 means that the columns taken run from index 1 (the second column) up to, but not including, the last column, which is denoted by -1. Similarly, [:, -1] selects only the last column.
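The slicing can be tried out on a small table. The sketch below builds a three-row stand-in DataFrame, assuming the Position, Level, Salary column layout described in the problem statement; the rows themselves are made up for illustration.

```python
import pandas as pd

# Tiny stand-in table with the assumed column layout (Position, Level, Salary);
# the rows are made up for illustration.
demo = pd.DataFrame({
    "Position": ["Analyst", "Manager", "Director"],
    "Level": [1, 2, 3],
    "Salary": [40000, 70000, 120000],
})

X_demo = demo.iloc[:, 1:-1].values  # columns from index 1 up to (not including) the last
y_demo = demo.iloc[:, -1].values    # last column only

print(X_demo.shape)  # (3, 1): a 2-D array with one feature column (Level)
print(y_demo.shape)  # (3,): a 1-D target array (Salary)
```

Keeping X two-dimensional matters because scikit-learn estimators expect the features as a matrix, even when there is only one feature.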

Step 2: Linear Regression Model

In the next step, we shall build a Linear Regression model and use it to predict the salary from the independent variable. For this, the class LinearRegression is imported from the sklearn.linear_model module. It is then fitted on the variables X and y for training purposes.

from sklearn.linear_model import LinearRegression

regressor = LinearRegression()

regressor.fit(X, y)

Once the model is built, on visualizing the results, we get the following graph.

As can be clearly seen, a straight line fitted to a non-linear dataset fails to capture the relationship in the data. Thus, we need to go for Polynomial Regression to model the relationship between the variables.
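A plot of this kind can be produced along the following lines. This is a sketch only: it uses made-up stand-in data with ten levels rather than the article's CSV, the output file name linear_fit.png is hypothetical, and the red and green colours are chosen to match the convention used in this article for real data and predictions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression

# Stand-in data (made up): ten levels with salaries that grow faster at higher levels.
X = np.arange(1, 11).reshape(-1, 1)
y = 40000.0 * 1.5 ** (X.ravel() - 1)

regressor = LinearRegression().fit(X, y)
predicted = regressor.predict(X)

fig, ax = plt.subplots()
ax.scatter(X, y, color="red", label="Actual salary")
ax.plot(X, predicted, color="green", label="Linear fit")
ax.set_xlabel("Level")
ax.set_ylabel("Salary")
ax.set_title("Linear Regression on non-linear data")
ax.legend()
fig.savefig("linear_fit.png")  # hypothetical output file name

# Residuals that are positive at both ends and negative in the middle
# are a typical sign that a straight line is missing curvature.
residuals = y - predicted
print(np.sign(residuals))
```

In an interactive session, plt.show() can be used in place of the Agg backend and savefig call.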

Step 3: Polynomial Regression Model

In this next step, we shall fit a Polynomial Regression model on this dataset and visualize the results. For this, we import another class, PolynomialFeatures, from the sklearn.preprocessing module and give it the degree of the polynomial equation to be built. It expands X into its polynomial terms. Then the LinearRegression class is used to fit the polynomial equation to the dataset.

from sklearn.preprocessing import PolynomialFeatures

from sklearn.linear_model import LinearRegression

poly_reg = PolynomialFeatures(degree = 2)

X_poly = poly_reg.fit_transform(X)

lin_reg = LinearRegression()

lin_reg.fit(X_poly, y)

In the above case, we have set the degree of the polynomial equation to 2. On plotting the graph, we see that a curve is obtained, but the predicted curve (in green) still deviates considerably from the real data (in red). Thus, in the next step we shall increase the degree of the polynomial to 3 and 4 and compare the results.
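Trying several degrees is a small change to the code above: only the degree argument varies. The sketch below loops over degrees 2, 3 and 4 on made-up stand-in data (not the article's CSV) and records the training R^2 for each. It also shows how a prediction for a new level could be made; the level 6.5 is a hypothetical input.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Stand-in data (made up): ten levels with salaries that grow faster at higher levels.
X = np.arange(1, 11).reshape(-1, 1)
y = 40000.0 * 1.5 ** (X.ravel() - 1)

train_r2 = {}
for degree in (2, 3, 4):
    poly_reg = PolynomialFeatures(degree=degree)
    X_poly = poly_reg.fit_transform(X)
    lin_reg = LinearRegression().fit(X_poly, y)
    train_r2[degree] = lin_reg.score(X_poly, y)
    print(f"degree {degree}: training R^2 = {train_r2[degree]:.4f}")

# Predict for a new level with the last (degree 4) model. New inputs go through
# transform(), not fit_transform(), so they get the same expansion as the training data.
new_level = np.array([[6.5]])
prediction = lin_reg.predict(poly_reg.transform(new_level))[0]
print(f"predicted salary at level 6.5: {prediction:.0f}")
```

The training R^2 can only stay the same or rise as the degree increases, which is why a good training score alone does not show that a higher degree is better.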

On comparing the results of Polynomial Regression with degrees 3 and 4, we see that as the degree increases, the curve follows the training data more closely. Thus, we can infer that a higher degree enables the polynomial equation to fit the training data more accurately. However, a very close fit to so few training points is also a typical symptom of overfitting. Thus, it becomes important to choose the value of n carefully to prevent overfitting.

What is Overfitting?

Overfitting is a situation in statistics in which a function (or a Machine Learning model in this case) is fitted too closely to a limited set of data points. This causes the function to perform poorly on new data points.

In Machine Learning, a model that overfits a given set of training data points performs badly when it is introduced to a completely new set of points (say, the test dataset), because it has fitted the particular training points, including their noise, rather than generalizing from them.


In polynomial regression, the model becomes increasingly likely to overfit the training data as the degree of the polynomial is increased. The example shown above is a typical case. The remedy is to choose the degree empirically: try several candidate degrees and keep the one that gives the lowest error on data that was not used for training.
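Choosing the degree by trial and error can be made systematic by holding out part of the data. The sketch below is one possible approach, not the only one: it uses synthetic quadratic data with noise (made up for illustration), splits it with train_test_split, and compares the training and held-out mean squared error for degrees 1 to 9.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (made up): a quadratic trend plus random noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 30).reshape(-1, 1)
y = 1.0 + 2.0 * x.ravel() - 3.0 * x.ravel() ** 2 + rng.normal(scale=0.2, size=30)

# Hold out 30% of the points; they are never used for fitting.
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=0)

train_mse, test_mse = {}, {}
for degree in range(1, 10):
    # The pipeline expands the features and fits the linear model in one step.
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    model.fit(x_train, y_train)
    train_mse[degree] = mean_squared_error(y_train, model.predict(x_train))
    test_mse[degree] = mean_squared_error(y_test, model.predict(x_test))

# Keep the degree with the lowest error on the held-out points.
best_degree = min(test_mse, key=test_mse.get)
print("degree with lowest held-out error:", best_degree)
```

The training error keeps falling as the degree grows, while the held-out error typically stops improving once the degree exceeds what the data supports. With a dataset as small as the ten-row salary table, cross-validation would be a more reliable variant of the same idea.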

Conclusion

To conclude, Polynomial Regression is used in many situations where there is a non-linear relationship between the dependent and independent variables. Though this algorithm is sensitive to outliers, this can be mitigated by treating them before fitting the regression curve. In this article, we have been introduced to the concept of Polynomial Regression along with an example of its implementation in Python on a simple dataset.


Frequently Asked Questions (FAQs)

1. What do you mean by linear regression?

Linear regression is a type of predictive numerical analysis through which we can estimate the value of an unknown (dependent) variable with the help of one or more independent variables. It also describes the connection between one dependent variable and one or more independent variables. Linear regression fits a trend line to a set of data points, and the fitted line can then be used as a prediction model, even for data that looks noisy, such as stock prices. There are several methods for fitting a linear regression. One of the most prevalent is the ordinary least-squares approach, which chooses the coefficients that minimize the sum of the squared vertical distances between the data points and the trend line.
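For a single predictor, the ordinary least-squares criterion can be written out as follows. This is the standard textbook form; the symbols b_0 and b_1 follow the notation used earlier in the article, and the m data points (x_i, y_i) are introduced here as an assumption for illustration.

```latex
\min_{b_0,\, b_1} \; \sum_{i=1}^{m} \bigl( y_i - (b_0 + b_1 x_i) \bigr)^2
```

Each term in the sum is the squared vertical distance between one data point and the trend line.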

2. What are some of Linear Regression's drawbacks?

In most cases, regression analysis is used in research to establish that there is a link between variables. However, correlation does not imply causation: a link between two variables does not mean that one causes the other. Even a line in a simple linear regression that fits the data points well does not guarantee a cause-and-effect relationship. A linear regression model can tell you whether or not there is a correlation between variables. Further investigation and statistical analysis are required to determine the exact nature of the link and whether one variable causes the other.

3. What are the basic assumptions of linear regression?

In linear regression, there are three key assumptions. First and foremost, the dependent and independent variables must have a linear relationship. A scatter plot of the dependent and independent variables is used to check this relationship. Second, there should be minimal or zero multicollinearity between the independent variables in the dataset, meaning that the independent variables should not be strongly correlated with one another. How much multicollinearity is acceptable depends on the requirements of the domain. The third assumption is homoscedasticity: the errors should have constant variance across all values of the independent variables. It is one of the most essential assumptions.