Regression is a supervised machine learning technique used to model the relationship between a continuous dependent variable and a set of independent variables.
In other words, regression predicts the value of a real-valued (continuous) variable from its relationship with other input variables. For instance, predicting a person's salary from characteristics such as their years of experience, education level, and role in the company is a regression problem.
As a simple exercise, can you accurately guess the salary of employee number ‘5’ by looking at the data given below?
| Employee No. | Years of Experience | Education Level | Role in Company | Salary (USD) |
|---|---|---|---|---|
It is very hard to do so, isn't it? The relationship between the target variable (salary) and the given features is not straightforward in the above dataset. Regression algorithms are therefore used to learn the underlying relationship from the available data, so that the value of such a continuous variable can be estimated for new observations.
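To make the idea concrete, here is a minimal sketch of the simplest case: fitting a straight line to a single feature by ordinary least squares. The numbers are made-up toy values, not the data from the table above, and the function names `fit_simple_linear_regression` and `predict` are hypothetical helpers chosen for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-deviations / sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict the target value for a single input x."""
    return slope * x + intercept


# Made-up toy data (an assumption for illustration, NOT the lesson's dataset).
# It follows salary = 30000 + 5000 * years exactly, so the fit is perfect.
years_of_experience = [1, 2, 3, 4, 5]
salary_usd = [35000, 40000, 45000, 50000, 55000]

slope, intercept = fit_simple_linear_regression(years_of_experience, salary_usd)
predicted_salary = predict(slope, intercept, 6)
print(f"slope={slope:.1f}, intercept={intercept:.1f}, prediction={predicted_salary:.1f}")
```

Real salary data would involve several features and noise, so a single straight line is rarely enough; that is what motivates the other regression techniques.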
There are several regression techniques in Machine Learning, and the choice among them depends on the types of the available variables and on how the data is distributed.
Some of the techniques that we will be discussing in this section of the course are Simple Linear Regression, Multiple Linear Regression, Polynomial Regression, Decision Tree Regression, Random Forest Regression, and Support Vector Regression.
Each type of regression algorithm has its own strengths and its own best-suited conditions for application. A good data scientist typically applies several regression algorithms to the same problem and picks the best one through model evaluation, for example by comparing their prediction errors on data held out from training.
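The selection step can be sketched as follows: fit each candidate on a training set, score it on held-out points, and keep the model with the lowest error. This is only an illustration under stated assumptions. The data is made up, the two candidates (a mean-only baseline and a simple linear fit) stand in for the algorithms covered in this course, and all function names are hypothetical.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def fit_mean_baseline(xs, ys):
    """Baseline model: ignore x and always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Simple linear regression fitted by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# Made-up toy data (an assumption, NOT the lesson's dataset):
# years of experience -> salary in USD, with two points held out for testing.
train_x = [1, 2, 4, 5, 7, 8]
train_y = [36000, 39500, 49000, 55500, 65000, 70500]
test_x = [3, 6]
test_y = [45500, 59500]

candidates = {
    "mean baseline": fit_mean_baseline,
    "simple linear regression": fit_linear,
}

# Fit every candidate on the training data and score it on the held-out data.
scores = {}
for name, fit in candidates.items():
    model = fit(train_x, train_y)
    predictions = [model(x) for x in test_x]
    scores[name] = mean_squared_error(test_y, predictions)

# The model with the lowest held-out error is selected.
best_model = min(scores, key=scores.get)
for name, mse in scores.items():
    print(f"{name}: MSE = {mse:.1f}")
print("selected:", best_model)
```

A single small hold-out set like this gives a noisy estimate; with real data, a larger test set or cross-validation would usually be a safer basis for the choice.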
We will discuss these regression techniques and their implementation in Python one by one in the upcoming lessons.