The term ‘multiple’ in multiple linear regression is self-explanatory: the model represents the relationship between two or more independent input variables and a response variable. Multiple linear regression is needed when a single input variable is not sufficient to build a good model and make accurate predictions.
Let’s hear Rahim talk about it.
You saw that multiple linear regression proved useful in creating a better model, as the value of the R-squared changed noticeably. Recall that the R-squared for simple linear regression using 'TV' as the input variable was 0.816. With two input variables, namely 'Newspaper' and 'TV', the R-squared increases to 0.836. Using 'Radio' along with 'TV' increases it to 0.910. So, it seems that adding a new variable helps explain the variance in the data better, although the size of the gain depends on the variable: 'Newspaper' raises the R-squared only slightly, whereas 'Radio' raises it substantially.
It is recommended that you check the R-squared each time you add a variable to see how much the model has actually improved.
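The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data below is synthetic (randomly generated, not the advertising dataset from the lesson), the helper name `r_squared` is hypothetical, and the fit uses plain least squares from NumPy rather than whichever library the lesson uses. The R-squared values it prints will therefore not match 0.816, 0.836, or 0.910.

```python
import numpy as np

def r_squared(X, y):
    """Fit least squares with an intercept and return R-squared on the same data."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)  # treat a single variable as one column
    A = np.column_stack([np.ones(len(X)), X])  # prepend the intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    ss_res = np.sum(residuals ** 2)        # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)   # total variation
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (an assumption, NOT the lesson's dataset):
# sales depends on TV and Radio, while Newspaper is unrelated noise.
rng = np.random.default_rng(0)
n = 200
tv = rng.uniform(0, 300, n)
radio = rng.uniform(0, 50, n)
newspaper = rng.uniform(0, 100, n)
sales = 3.0 + 0.05 * tv + 0.2 * radio + rng.normal(0, 1.5, n)

r2_tv = r_squared(tv, sales)
r2_tv_news = r_squared(np.column_stack([tv, newspaper]), sales)
r2_tv_radio = r_squared(np.column_stack([tv, radio]), sales)

print(f"TV only:        {r2_tv:.3f}")
print(f"TV + Newspaper: {r2_tv_news:.3f}")
print(f"TV + Radio:     {r2_tv_radio:.3f}")
```

When the model includes an intercept and is scored on the data it was fitted to, adding a variable can never lower the R-squared, so the useful question is how large the increase is, not whether there is one.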
Let’s now look at the formulation of multiple linear regression. It is an extension of simple linear regression, so most of the concepts carry over and the formulation is largely the same. The formulation for predicting the response variable now becomes this:
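In standard textbook notation, the model with p input variables is usually written as shown below. The symbols are assumptions, since the lesson's own notation is not shown here: Y is the response variable, X_1 to X_p are the input variables, \beta_0 is the intercept, \beta_1 to \beta_p are the coefficients, and \epsilon is the error term.

```latex
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \epsilon
```

With p = 1, this reduces to the simple linear regression model, Y = \beta_0 + \beta_1 X_1 + \epsilon.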
However, some other aspects remain the same: