Regression is used to model the relationship between variables. The variables that are used to make predictions are called independent variables, and the variable that is to be predicted is called the dependent variable.
While training a regression model on data, the model can overfit or underfit because of high collinearity among the features or because the data is insufficient.
Deciding which variables to use as independent variables is difficult. When selecting variables for a linear model, some people look at individual p-values. This method is unreliable because when variables are highly correlated, their individual p-values are inflated. On the other hand, including independent variables that are not related to the dependent variable adds unnecessary complexity to the model.
To overcome these problems, we use regularization methods; two such methods are Ridge and Lasso regression.
The linear regression model can be written as

\hat{y} = w_1 x_1 + w_2 x_2 + \dots + w_p x_p + b

where the x_j are the independent variables, the w_j are the coefficients, b is the intercept, and \hat{y} is the predicted value of the dependent variable.
To fit the model, we choose the weights and the bias that minimize a cost function. For ordinary least squares, the cost function is the residual sum of squares:

J(w, b) = \sum_{i=1}^{M} \left( y_i - \sum_{j=1}^{p} w_j x_{ij} - b \right)^2

Here, M is the number of instances and p is the number of features.
Ridge and Lasso regression modify this cost function by adding a penalty term, which acts as a constraint on the coefficients.
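As a quick illustration, here is a minimal NumPy sketch of this cost function. The function name and the arrays X, y, w, and b are assumptions chosen for the example, not names from the original text.

```python
import numpy as np

def ols_cost(X, y, w, b):
    """Residual sum of squares for the linear model y_hat = X @ w + b.

    X : (M, p) feature matrix, y : (M,) targets,
    w : (p,) coefficients, b : scalar intercept.
    """
    y_hat = X @ w + b           # predictions for all M instances
    return np.sum((y - y_hat) ** 2)

# Tiny usage example with made-up numbers
X = np.array([[1.0, 2.0], [2.0, 0.5], [3.0, 1.0]])
y = np.array([3.0, 2.5, 4.0])
print(ols_cost(X, y, w=np.array([0.5, 0.5]), b=1.0))
```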
Ridge Regression
In Ridge regression, the cost function is modified by adding a penalty proportional to the sum of the squares of the coefficients.
The cost function of Ridge regression:

J(w, b) = \sum_{i=1}^{M} \left( y_i - \sum_{j=1}^{p} w_j x_{ij} - b \right)^2 + \lambda \sum_{j=1}^{p} w_j^2

Equivalently, Ridge regression minimizes the residual sum of squares subject to the constraint

\sum_{j=1}^{p} w_j^2 \leq t

for some budget t that depends on \lambda.
So Ridge regression constrains the coefficients through the cost function; this penalty is called L2 regularization. Lambda is the penalty term that controls the strength of the constraint: the larger the coefficients grow, the more the cost function is penalized, so the optimizer keeps them small. That is how Ridge shrinks the coefficients.
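A minimal scikit-learn sketch of this shrinkage follows. The synthetic data with two nearly collinear features, and the alpha value (scikit-learn's name for lambda), are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic data with two highly correlated features (illustrative only)
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.01, size=100)   # nearly collinear with x1
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(scale=0.1, size=100)

# Plain least squares: coefficients can blow up under collinearity
print(LinearRegression().fit(X, y).coef_)

# Ridge (alpha plays the role of lambda): coefficients stay small
print(Ridge(alpha=1.0).fit(X, y).coef_)
```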
Lasso Regression
The cost function for Lasso regression can be written as

J(w, b) = \sum_{i=1}^{M} \left( y_i - \sum_{j=1}^{p} w_j x_{ij} - b \right)^2 + \lambda \sum_{j=1}^{p} |w_j|

or, in constrained form, as minimizing the residual sum of squares subject to

\sum_{j=1}^{p} |w_j| \leq t

where t is the budget on the total magnitude of the coefficients.
In Lasso regression, instead of squaring the coefficients, we penalize their absolute values (magnitudes). This type of constraint is called L1 regularization, and it can shrink the coefficients of the less important features exactly to zero, removing those features from the model altogether.
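Here is a minimal scikit-learn sketch of this feature-selection effect. The synthetic data, in which only the first feature actually drives y, and the alpha value are assumptions for the example.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data: 5 features, but only the first one drives y (illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 4 * X[:, 0] + rng.normal(scale=0.1, size=100)

# The L1 penalty drives the coefficients of the irrelevant features to exactly 0
lasso = Lasso(alpha=0.1).fit(X, y)
print(lasso.coef_)   # only the first coefficient remains nonzero
```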
If you need an implementation of any of the topics mentioned above, or assignment help on any of their variants, feel free to contact us.