# Assumptions of Linear Regression

Linear regression is a modeling approach that models the relationship between a dependent variable and one or more independent variables by fitting a linear equation to the observed data. Linear regression rests on five key assumptions, and this post covers each of them in turn.

## Linear Relationship

All features should have a linear relationship with the target variable, which can be checked with a scatter plot of each feature against the target. It is also important to check for outliers, since linear regression is sensitive to them.
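As a rough numeric companion to the scatter plot, the Pearson correlation between a feature and the target indicates how strong the linear relationship is. The sketch below uses synthetic data purely for illustration; the variable names and numbers are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic feature with a roughly linear effect on the target.
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 2.0 + rng.normal(0, 1.0, size=200)

# Pearson correlation is a quick numeric proxy for the scatter-plot check:
# values near +1 or -1 suggest a strong linear relationship.
r = np.corrcoef(x, y)[0, 1]
print(f"Pearson r = {r:.3f}")
```

A high |r| does not prove linearity (a curved relationship can still show moderate correlation), so the scatter plot remains the primary check.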

## Normal distribution

Strictly speaking, this assumption applies to the residuals rather than the features themselves, but heavily skewed features often lead to non-normal residuals in practice. Applying a transformation such as a logarithmic transformation can bring a skewed feature closer to a normal distribution.
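The effect of a log transform on a skewed feature can be seen by comparing sample skewness before and after. This is a minimal sketch on synthetic log-normal data (a common shape for incomes, prices, and counts); the helper function and data are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def skewness(a):
    # Sample skewness: the third standardized moment (0 for a symmetric
    # distribution, positive for a right-skewed one).
    a = np.asarray(a, dtype=float)
    z = (a - a.mean()) / a.std()
    return np.mean(z ** 3)

# A right-skewed feature drawn from a log-normal distribution.
feature = rng.lognormal(mean=0.0, sigma=1.0, size=5000)
transformed = np.log(feature)  # use np.log1p instead if zeros are possible

print(f"skew before: {skewness(feature):.2f}")
print(f"skew after:  {skewness(transformed):.2f}")
```

The log transform pulls the long right tail in, so the transformed values are close to normally distributed.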

## No or little multicollinearity

Multicollinearity happens when the independent features are highly correlated with one another. Correlation can be checked by using `df.corr()`; features can be positively or negatively correlated. If two features are too highly correlated, you may choose to remove one, or average the two into a new feature and drop the originals.
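A short sketch of this check with pandas, using made-up column names and an arbitrary 0.9 cutoff (the threshold is a judgment call, not a fixed rule):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 500

x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)  # nearly a copy of x1
x3 = rng.normal(size=n)                   # independent feature
df = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# Pairwise Pearson correlations between all features.
corr = df.corr()
print(corr.round(2))

# Flag feature pairs whose absolute correlation exceeds the cutoff.
high = [(a, b) for a in corr.columns for b in corr.columns
        if a < b and abs(corr.loc[a, b]) > 0.9]
print("highly correlated pairs:", high)
```

Here `x1` and `x2` would be flagged, and you could drop one of them or combine them before fitting the model.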

## No auto-correlation

The residuals need to be independent of one another: the residual at one observation should carry no information about the residual at the next. For time-ordered data, the error at time t should be independent of the error at time t+1.
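A common numeric check for this is the Durbin-Watson statistic, which is roughly 2 when residuals are uncorrelated and drops toward 0 under positive autocorrelation. A minimal sketch on synthetic residuals (the AR(1) coefficient 0.9 is an arbitrary illustration):

```python
import numpy as np

def durbin_watson(residuals):
    # DW statistic: ~2 means no autocorrelation, <2 suggests positive
    # autocorrelation, >2 suggests negative autocorrelation.
    r = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(r) ** 2) / np.sum(r ** 2)

rng = np.random.default_rng(3)

# Independent residuals.
independent = rng.normal(size=1000)

# AR(1) residuals: each value carries over 90% of the previous one.
correlated = np.empty(1000)
correlated[0] = rng.normal()
for t in range(1, 1000):
    correlated[t] = 0.9 * correlated[t - 1] + rng.normal()

print(f"DW independent: {durbin_watson(independent):.2f}")
print(f"DW correlated:  {durbin_watson(correlated):.2f}")
```

In practice you would compute this on the residuals of your fitted model rather than on synthetic series.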

## Homoscedasticity

The residuals need to have constant variance across all levels of the predicted values: the spread of the errors should not grow or shrink as the fitted values change. A residuals-versus-fitted plot that fans out in a funnel shape indicates heteroscedasticity.
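One simple way to quantify this is to split the data by the predictor and compare the spread of the residuals in each half; under homoscedasticity the two standard deviations should be similar. This sketch deliberately generates heteroscedastic data (noise growing with x) so the violation is visible; all names and parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000

x = rng.uniform(1, 10, size=n)

# Heteroscedastic target: the noise scale grows with x.
y = 2.0 * x + rng.normal(scale=x, size=n)

# Fit a simple least-squares line and compute residuals.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# Compare residual spread for low vs high values of x.
order = np.argsort(x)
low, high = residuals[order[: n // 2]], residuals[order[n // 2 :]]
print(f"residual std, low x:  {low.std():.2f}")
print(f"residual std, high x: {high.std():.2f}")
```

The much larger spread in the high-x half is the funnel shape in numeric form; for homoscedastic data the two values would be close.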