What is Regression in Machine Learning? Complete Guide

In this article, we will explain what regression is in machine learning.

What is Regression?

Regression is one type of supervised learning. This technique is used to predict continuous outcomes, unlike classification, where we predict categorical values.

There are 3 types of machine learning: supervised, unsupervised, and reinforcement learning.

In regression, we investigate the relationship between the independent variables (the features) and the dependent variable (the outcome, that is, the value we want to predict).

In regression problems, the output variable is a real, continuous value such as salary or weight.

Here is an example of a dataset used in regression to predict salary:

The feature (the independent variable) is the years of experience, and the outcome (the dependent variable) is the salary.

In most cases, we have more than one feature in our datasets!

Test your Knowledge on Regression

Exercise:

Let’s see if these examples are regression problems!

Predicting the age of a person? Yes, because age is a continuous value.

Predicting whether a tumor is benign or malignant? No, this is a classification problem because benign and malignant are categories.

Predicting the height of a person? Yes, because height is a continuous value.

Predicting the nationality of a person? No, this is a classification problem.

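Predicting whether the stock price of a company will increase next month? No, this is a classification problem because the answer is one of two categories: the price increases or it does not. Predicting the stock price itself would be a regression problem, because the price is a continuous value.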

Popular Regression Algorithms

Here are some popular regression algorithms.

First, linear regression. It works by estimating coefficients for a line or a hyperplane that best fits the data. This algorithm is simple and fast to train, and it performs well when the outcome is a linear combination of the features.
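To make the idea of "estimating coefficients for a line" concrete, here is a minimal sketch using the least-squares formulas for a single feature. The function names and the small salary dataset are made up for illustration; they are not taken from a real dataset.

```python
def fit_simple_linear_regression(xs, ys):
    """Estimate the slope and intercept of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_salary(years, slope, intercept):
    """Apply the fitted line to a new value of the feature."""
    return slope * years + intercept


# Hypothetical data: years of experience and salary (perfectly linear on purpose)
years_of_experience = [1, 2, 3, 4, 5]
salary = [35000, 40000, 45000, 50000, 55000]

slope, intercept = fit_simple_linear_regression(years_of_experience, salary)
print(slope, intercept)                       # 5000.0 30000.0
print(predict_salary(6, slope, intercept))    # 60000.0
```

With more than one feature, the same idea applies, but the line becomes a hyperplane and there is one coefficient per feature.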

The second algorithm is K-nearest neighbors (KNN). This algorithm is used for both classification and regression. A KNN regressor predicts the mean of the outcome values of the K most similar instances in the training dataset.
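As a rough sketch of how this works, the code below measures similarity with the Euclidean distance (an assumption, since other distance measures are possible) and averages the outcomes of the K closest training instances. The function names and data are hypothetical.

```python
def euclidean_distance(a, b):
    """Straight-line distance between two feature tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


def knn_predict(train_features, train_targets, query, k=3):
    """Predict the mean target of the k training instances closest to query."""
    pairs = sorted(
        zip(train_features, train_targets),
        key=lambda pair: euclidean_distance(pair[0], query),
    )
    nearest_targets = [target for _, target in pairs[:k]]
    return sum(nearest_targets) / k


# Hypothetical data: one feature (years of experience) per instance
train_features = [(1,), (2,), (3,), (4,), (5,)]
train_targets = [35000, 40000, 45000, 50000, 55000]

# The 3 nearest instances to 2.2 years are 2, 3, and 1 years of experience
print(knn_predict(train_features, train_targets, (2.2,), k=3))  # 40000.0
```

Note that there is no training step here: the algorithm simply stores the data and does the work at prediction time.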

The third algorithm is the decision tree. It works by creating a tree to evaluate an instance of data, starting at the root of the tree and moving down to the leaves until a prediction can be made.
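The walk from the root to a leaf can be sketched as follows. This example only shows prediction with a tree that is written by hand; the thresholds, leaf values, and the feature name `years_experience` are invented for illustration, whereas a real algorithm would learn them from the training data.

```python
# A hand-built regression tree stored as nested dictionaries.
# Internal nodes test a feature against a threshold; leaves hold a value.
salary_tree = {
    "feature": "years_experience",
    "threshold": 3,
    "left": {"value": 40000},
    "right": {
        "feature": "years_experience",
        "threshold": 6,
        "left": {"value": 55000},
        "right": {"value": 80000},
    },
}


def tree_predict(node, instance):
    """Start at the root and move down until a leaf is reached."""
    while "value" not in node:
        if instance[node["feature"]] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["value"]


print(tree_predict(salary_tree, {"years_experience": 2}))   # 40000
print(tree_predict(salary_tree, {"years_experience": 5}))   # 55000
print(tree_predict(salary_tree, {"years_experience": 10}))  # 80000
```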

The fourth algorithm is the support vector machine. It works by finding a line of best fit that minimizes the error of a cost function. This is done using an optimization process that considers only a subset of the data instances in the training dataset, the support vectors, to find the line with the minimum cost.
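One way to see why only some training instances matter is to look at the cost function commonly used in support vector regression, the epsilon-insensitive loss. This is an assumption on our part, since the description above does not name a specific cost function. The sketch below only computes the loss for a fixed line; it does not perform the optimization, and the line, the data, and the value of `epsilon` are hypothetical.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Errors smaller than epsilon cost nothing; larger ones cost the excess."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def line(x):
    """A fixed candidate line, y = 2x (chosen by hand for illustration)."""
    return 2.0 * x


# Hypothetical data points (x, y)
points = [(1.0, 2.5), (2.0, 4.0), (3.0, 9.0), (4.0, 7.2)]
epsilon = 1.0

for x, y in points:
    loss = epsilon_insensitive_loss(y, line(x), epsilon)
    print(x, y, loss)

# Only the point (3.0, 9.0) is more than epsilon away from the line,
# so it is the only one that contributes to the cost.
```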

The last algorithm is the multi-layer perceptron. This algorithm approximates a function that best fits the real-valued output using artificial neurons.
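To give a feel for how artificial neurons combine into a function, here is a minimal sketch of the forward pass of a network with one hidden layer and one input. The weights are picked by hand so that the network computes the absolute value of its input; in practice they would be learned from the training data. The function names and the choice of the ReLU activation are assumptions made for this example.

```python
def relu(z):
    """A common activation function: keep positive values, zero out the rest."""
    return max(0.0, z)


def mlp_forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer of neurons followed by a linear output neuron."""
    hidden = [relu(w * x + b) for w, b in zip(hidden_weights, hidden_biases)]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hand-picked weights: relu(x) + relu(-x) equals |x|
hidden_weights = [1.0, -1.0]
hidden_biases = [0.0, 0.0]
output_weights = [1.0, 1.0]
output_bias = 0.0

print(mlp_forward(3.0, hidden_weights, hidden_biases, output_weights, output_bias))   # 3.0
print(mlp_forward(-2.0, hidden_weights, hidden_biases, output_weights, output_bias))  # 2.0
```

Two neurons are enough here because the target function is very simple. Fitting more complex relationships generally takes more neurons and layers.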

Real-life Examples of Regression

Let’s explore some real-life examples of regression.

  • Businesses often use regression to understand the relationship between advertising spending and revenue.
  • Medical researchers often use regression to understand the relationship between drug dosage and the blood pressure of patients.
  • Data scientists for professional sports teams use regression to measure the effects that different training regimens have on player performance.
  • Organizations often use regression models to forecast future sales. This can be helpful for things like budgeting and planning. A lot of businesses use regression models to predict how stocks will perform in the future. This is done by analyzing past data on stock prices and trends to identify patterns.
  • Businesses can use regression to predict how much a customer is likely to spend. Regression models can also be used to predict consumer behavior.

Do you have any questions? Is anything unclear?
Leave a comment below.