Regression

Regression is the supervised learning task of predicting a continuous numerical value from input features. Given a person’s age and weight, predict their blood pressure. Given a year, predict the inflation rate. Given a house’s square footage and number of bedrooms, predict its sale price. The output is a real number, not a category.

The simplest regression model is linear regression, which assumes the output is a linear function of the inputs. For a single input feature $x$,

$f (x, w) = w_{0} + w_{1} x$

where $w_{0}$ is the intercept and $w_{1}$ is the slope. The vector $w = (w_{0}, w_{1})$ contains everything the model has learned. Training the model means finding good values for $w_{0}$ and $w_{1}$.
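As a minimal sketch of this model in Python (the function name `predict` and the example weights are illustrative, not part of these notes):

```python
def predict(x, w):
    """Evaluate f(x, w) = w0 + w1 * x for a single input feature."""
    w0, w1 = w  # intercept and slope
    return w0 + w1 * x


# Hypothetical weights: intercept 1.0, slope 3.0
w = (1.0, 3.0)
print(predict(2.0, w))  # 1.0 + 3.0 * 2.0 = 7.0
```

Everything the model "knows" lives in the tuple `w`; changing those two numbers changes every prediction.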

Linear models can only capture linear relationships. If the data curves, a straight line is a poor fit. The natural extension is polynomial regression:

$f (x, w) = w_{0} + w_{1} x + w_{2} x^{2} + \dots + w_{m} x^{m}$

With $m = 1$ we recover linear regression. Higher $m$ fits more complex shapes (quadratics, cubics, beyond) at the cost of more parameters and more data needed to estimate them well.
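One way to see the effect of $m$ is to fit a line and a quadratic to the same curved data. The sketch below assumes NumPy and uses synthetic, noise-free data; the helper names `fit_polynomial` and `predict_polynomial` are made up for illustration. Because the polynomial is still linear in the weights $w$, an ordinary least-squares solver can fit it once the powers of $x$ are laid out as columns.

```python
import numpy as np


def fit_polynomial(x, y, m):
    """Least-squares fit of a degree-m polynomial; returns (w_0, ..., w_m)."""
    # Columns of X are x^0, x^1, ..., x^m
    X = np.vander(x, N=m + 1, increasing=True)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def predict_polynomial(x, w):
    """Evaluate w_0 + w_1 x + ... + w_m x^m at each entry of x."""
    return np.vander(x, N=len(w), increasing=True) @ w


# Synthetic, noise-free data from a known quadratic (illustrative only)
x = np.linspace(-2.0, 2.0, 21)
y = 1.0 + 2.0 * x + 3.0 * x**2

w_line = fit_polynomial(x, y, m=1)  # straight line
w_quad = fit_polynomial(x, y, m=2)  # quadratic

mse_line = np.mean((predict_polynomial(x, w_line) - y) ** 2)
mse_quad = np.mean((predict_polynomial(x, w_quad) - y) ** 2)
print("line weights:", w_line, "MSE:", mse_line)
print("quadratic weights:", w_quad, "MSE:", mse_quad)
```

On this data the quadratic recovers the generating weights and its error drops to numerical zero, while the straight line cannot follow the curvature.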

Training proceeds by:

1. Picking a loss function that measures how far predictions are from the labels. The standard choice for regression is mean squared error.
2. Finding parameters $w$ that minimize the loss. For linear regression with mean squared error this has a closed-form solution; for more complex models we use gradient descent.
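The two training steps above can be sketched in plain Python for the single-feature model. This is an illustration under stated assumptions, not a production implementation: the data points, the learning rate `lr`, and the step count are arbitrary choices, and the function names are hypothetical.

```python
def mse(w, xs, ys):
    """Mean squared error of f(x, w) = w0 + w1 * x on the data."""
    w0, w1 = w
    return sum((w0 + w1 * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_closed_form(xs, ys):
    """Minimize MSE exactly: slope = cov(x, y) / var(x)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w1 = sxy / sxx
    w0 = mean_y - w1 * mean_x
    return (w0, w1)


def fit_gradient_descent(xs, ys, lr=0.05, steps=5000):
    """Minimize MSE iteratively by stepping against its gradient."""
    w0, w1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w0 + w1 * x - y for x, y in zip(xs, ys)]
        grad_w0 = 2.0 / n * sum(errors)
        grad_w1 = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        w0 -= lr * grad_w0
        w1 -= lr * grad_w1
    return (w0, w1)


# Made-up data lying close to a straight line
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

w_exact = fit_closed_form(xs, ys)
w_gd = fit_gradient_descent(xs, ys)
print("closed form:", w_exact, "MSE:", mse(w_exact, xs, ys))
print("gradient descent:", w_gd, "MSE:", mse(w_gd, xs, ys))
```

Both routes should land on essentially the same weights here. The closed form gets there in one pass, while gradient descent needs a suitable learning rate and enough steps, but carries over to models that have no closed-form solution.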

The complementary supervised task is classification, where the output is a discrete category instead of a continuous value. The two are closely related: logistic regression, for instance, is a classifier that passes the output of a linear model through a sigmoid, turning it into a probability between 0 and 1.
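To make the link to classification concrete, here is a small sketch of the sigmoid applied to a linear model's output. The weights, the 0.5 threshold, and the function names are assumptions chosen for illustration, and the code is not hardened against overflow for extreme inputs.

```python
import math


def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(x, w):
    """Probability of the positive class: sigmoid(w0 + w1 * x)."""
    w0, w1 = w
    return sigmoid(w0 + w1 * x)


def classify(x, w, threshold=0.5):
    """Turn the probability into a discrete label (0 or 1)."""
    return 1 if predict_probability(x, w) >= threshold else 0


# Hypothetical weights: the linear part crosses zero at x = 2
w = (-4.0, 2.0)
for x in (1.0, 2.0, 3.0):
    print(x, predict_probability(x, w), classify(x, w))
```

The linear part `w0 + w1 * x` is the same function as in linear regression; only the final squashing and thresholding turn it into a classifier.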

In scikit-learn, sklearn.linear_model.LinearRegression() fits a linear regression by ordinary least squares with a direct solver; sklearn.linear_model.SGDRegressor() fits a linear model by stochastic gradient descent instead (with a small L2 penalty by default), which suits very large datasets.
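A short usage sketch of both estimators, assuming scikit-learn and NumPy are installed. The data is synthetic and noise-free, with a single feature already at roughly unit scale (SGD is sensitive to feature scaling), and `random_state=0` is only there to make the run repeatable.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, SGDRegressor

# Synthetic data: y = 1 + 2 * x, one feature of roughly unit scale
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 1))
y = 1.0 + 2.0 * X[:, 0]

ols = LinearRegression().fit(X, y)            # direct least-squares solve
sgd = SGDRegressor(random_state=0).fit(X, y)  # iterative, default settings

print("LinearRegression:", ols.intercept_, ols.coef_)
print("SGDRegressor:", sgd.intercept_, sgd.coef_)
```

The least-squares fit should recover the generating weights almost exactly, while the SGD estimate is approximate: it depends on the learning-rate schedule, the stopping tolerance, and the default regularization.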

Idriss Rami — Notes
