Hands-On Tutorial on ElasticNet Regression

Elastic Net is a regularized regression model that combines the l1 and l2 penalties of lasso and ridge regression. Regularization helps reduce overfitting in models.

Elastic Net is a regression method that performs variable selection and regularization simultaneously. Regularization is the main concept behind the elastic net, and it comes into the picture when a model overfits. Overfitting occurs when a model performs well on the training dataset but produces large errors on the test dataset. Regularization reduces these errors by constraining the fitted function: it adds a term to the loss function that discourages large coefficients. These added terms are called penalties.
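To make the gap between training and test error concrete, here is a minimal sketch using only NumPy. The sine-shaped target, the noise level, the sample sizes and the polynomial degrees are all illustrative assumptions and are not related to the housing data used later in the article.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_sine(x):
    # Hypothetical toy target: a sine curve plus Gaussian noise.
    return np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)

def mse(coeffs, x, y):
    # Mean squared error of a polynomial with the given coefficients.
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

x_train = np.linspace(0.0, 1.0, 10)
y_train = noisy_sine(x_train)
x_test = rng.uniform(0.0, 1.0, size=200)
y_test = noisy_sine(x_test)

# A degree-9 polynomial has enough freedom to pass through all 10 training points.
overfit = np.polyfit(x_train, y_train, deg=9)
# A degree-3 polynomial is a more constrained model.
simple = np.polyfit(x_train, y_train, deg=3)

for name, coeffs in [("degree 9", overfit), ("degree 3", simple)]:
    print(name,
          "| train MSE:", mse(coeffs, x_train, y_train),
          "| test MSE:", mse(coeffs, x_test, y_test))
```

The degree-9 fit reproduces the training points almost exactly, yet its error on the unseen test points is much larger than its training error, which is the pattern described above.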

There are two types of penalties, l1 and l2. A model that uses the l1 penalty for regularization is called a lasso regression model, and a model that uses the l2 penalty is called a ridge regression model. Lasso regression adds the absolute value of the magnitude of each coefficient as a penalty term to the loss function. Ridge regression adds the squared magnitude of each coefficient as a penalty term to the loss function.
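The two penalty terms can be sketched as small NumPy functions. The function names, the coefficient vector and the strength `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np

def l1_penalty(coef, lam):
    # Lasso-style penalty: lam times the sum of absolute coefficient values.
    return lam * np.sum(np.abs(coef))

def l2_penalty(coef, lam):
    # Ridge-style penalty: lam times the sum of squared coefficient values.
    return lam * np.sum(np.square(coef))

coef = np.array([3.0, -0.5, 0.0, 2.0])  # hypothetical coefficient vector

print(l1_penalty(coef, lam=0.1))  # 0.1 * (3 + 0.5 + 0 + 2), about 0.55
print(l2_penalty(coef, lam=0.1))  # 0.1 * (9 + 0.25 + 0 + 4), about 1.325
```

Because the l2 penalty squares each coefficient, it penalizes large coefficients far more heavily than small ones, while the l1 penalty grows linearly with the magnitude.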

Lasso stands for least absolute shrinkage and selection operator. As the name suggests, lasso regression shrinks the coefficients toward zero, and when a coefficient is shrunk to exactly zero, the corresponding variable is eliminated from the model. Ridge regression does not eliminate coefficients from the model: it shrinks them by penalizing their squared magnitude but does not set them exactly to zero, so it does not separate important from less important predictors and keeps all of them.
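The difference is easiest to see in a special case. Assuming an orthonormal design matrix and a loss of the form (1/2) * squared error plus penalty, with the l2 penalty also carrying a factor of 1/2, the lasso solution is the soft-thresholded least-squares estimate and the ridge solution is the least-squares estimate divided by (1 + lambda). The sketch below only illustrates that special case; the function names and numbers are hypothetical.

```python
import numpy as np

def lasso_shrink(b, lam):
    # Soft-thresholding: move each coefficient toward zero by lam and
    # set it to exactly zero when its magnitude is at most lam.
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)

def ridge_shrink(b, lam):
    # Proportional shrinkage: every coefficient is scaled down but stays nonzero.
    return b / (1.0 + lam)

ols = np.array([4.0, -1.5, 0.3, -0.1])  # hypothetical least-squares estimates
lam = 0.5

print("lasso:", lasso_shrink(ols, lam))
print("ridge:", ridge_shrink(ols, lam))
```

With these values, lasso sets the two smallest coefficients to zero, while ridge keeps all four at reduced magnitude.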

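Mathematically we can represent the ridge function as follows.

$$\hat{\beta}^{\text{ridge}} = \underset{\beta}{\arg\min} \; \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$$

And the lasso function can be represented as:

$$\hat{\beta}^{\text{lasso}} = \underset{\beta}{\arg\min} \; \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|$$

Here $n$ is the number of observations, $p$ is the number of predictors, and the last term in each expression, scaled by the tuning parameter $\lambda \ge 0$, is the penalty that the model adds to the residual sum of squares.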

But these models have certain limitations. Ridge regression reduces the complexity of the model but does not eliminate any variables, so it can increase accuracy only up to a point: on a dataset with many predictors, the model keeps all of them, including irrelevant ones, and can stop performing well. Lasso regression selects variables according to the number of observations, not the number of predictors present in the data: when there are more predictors than observations, it selects at most as many variables as there are observations. The elastic net regression model addresses these limitations by including both kinds of penalties (l1 and l2) in the model.

What is Elastic Net?

Elastic Net is a regularized regression model that combines l1 and l2 penalties, i.e., lasso and ridge regression. We have discussed the limitation of lasso regression in the number of predictors it can choose. The elastic net includes the penalties of both lasso and ridge regression; when only the l2 penalty is used, it becomes ridge regression, and when only the l1 penalty is used, it becomes lasso regression. The regularization procedure of the elastic net can be viewed in two stages: first, we find the ridge regression coefficients; after this, we apply lasso-type shrinkage to those ridge coefficients.

This will be easier to understand by the following diagram.

Image source

Here we can see that the lasso step follows the ridge step, so the procedure takes all the variables in the dataset into account.

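Mathematically we can represent the elastic net as follows.

$$\hat{\beta}^{\text{enet}} = \underset{\beta}{\arg\min} \; \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 + \lambda_2 \sum_{j=1}^{p} \beta_j^2 + \lambda_1 \sum_{j=1}^{p} |\beta_j|$$

Here $n$ is the number of observations, $p$ is the number of predictors, and $\lambda_1 \ge 0$ and $\lambda_2 \ge 0$ control the strength of the l1 and l2 penalties respectively.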
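scikit-learn, which is used in the implementation later in this article, expresses the two penalty strengths through a single `alpha` and a mixing parameter `l1_ratio`. The sketch below writes out the objective in the form given in the scikit-learn `ElasticNet` documentation, with the intercept left out for brevity. The function name and the toy data are assumptions made for illustration.

```python
import numpy as np

def elastic_net_objective(X, y, w, alpha, l1_ratio):
    # 1 / (2 * n_samples) * ||y - Xw||^2
    #   + alpha * l1_ratio * ||w||_1
    #   + 0.5 * alpha * (1 - l1_ratio) * ||w||^2
    n_samples = X.shape[0]
    residual = y - X @ w
    data_fit = residual @ residual / (2 * n_samples)
    l1_term = alpha * l1_ratio * np.sum(np.abs(w))
    l2_term = 0.5 * alpha * (1 - l1_ratio) * np.sum(w ** 2)
    return data_fit + l1_term + l2_term

# Hypothetical toy data, chosen so that the arithmetic is easy to follow.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
w = np.array([2.0, -1.0])
y = X @ w  # zero residual, so only the penalty terms remain

print(elastic_net_objective(X, y, w, alpha=0.1, l1_ratio=1.0))  # l1 only: 0.1 * 3
print(elastic_net_objective(X, y, w, alpha=0.1, l1_ratio=0.0))  # l2 only: 0.05 * 5
print(elastic_net_objective(X, y, w, alpha=0.1, l1_ratio=0.3))  # mix: 0.09 + 0.175
```

Setting `l1_ratio=1.0` leaves only the lasso penalty and `l1_ratio=0.0` leaves only the ridge penalty, which matches the two limiting cases of the elastic net.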

Implementing ElasticNet Regression

We can perform ElasticNet in our analysis using Python's sklearn library, where the linear_model module provides the ElasticNet class for regularization and variable selection. Next, I will compare lasso and elastic net regression on the California housing data provided by sklearn. The data has 20640 samples with eight features. For a more detailed description of the data, the reader can follow this link.

```python
from sklearn.datasets import fetch_california_housing

X_data, y_data = fetch_california_housing(return_X_y=True)
```

Splitting the data for training and testing purposes:

```python
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X_data, y_data, test_size=0.3)
```

Checking the shape of the data:

```python
print('shape of X :', X_data.shape, 'shape of Y :', y_data.shape)
print('shape of X-train :', X_train.shape, 'shape of Y-train :', y_train.shape)
print('shape of X-test :', X_test.shape, 'shape of Y-test :', y_test.shape)
```

Output:

Here we can see the structure of the data.

Importing the lasso model and instantiating a model object:

```python
from sklearn.linear_model import Lasso

alpha = 0.1
model_lasso = Lasso(alpha=alpha)

print(model_lasso)
```

Output:

Fitting lasso model:

```python
model_lasso.fit(X_train, y_train)

pred_lasso = model_lasso.predict(X_test)
```

Checking for the R-Squared value:

```python
from sklearn.metrics import r2_score

print("r^2 of lasso on test data : %f" % r2_score(y_test, pred_lasso))
```

Output:

Here we can see the R-squared value for the model. It is quite good but can be improved. Next, we will try to improve the performance, measured by the R-squared value, using an elastic net regression model.

Importing the model and defining an object for it:

```python
from sklearn.linear_model import ElasticNet

# Reuses the alpha defined for the lasso model above
model_enet = ElasticNet(alpha=alpha, l1_ratio=0.3)

print(model_enet)
```

Output:

Training the model:

```python
model_enet.fit(X_train, y_train)

# Testing the model
pred_enet = model_enet.predict(X_test)
print("r^2 on test data : %f" % r2_score(y_test, pred_enet))
```

Output:

Here we can see that we have improved the R-squared value using ElasticNet regression. We can also visualize the performance of the models.

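Generating a reference vector of decreasing coefficients with alternating signs for visualization (these are synthetic values, not the actual coefficients underlying the housing data):

```python
import numpy as np

idx = np.arange(8)  # one entry per feature

# Coefficients that alternate in sign and decay in magnitude
coef = (-1) ** idx * np.exp(-idx / 10)
coef[10:] = 0  # sparsify coef (no effect here: coef has only 8 entries)

y = np.dot(X_data, coef)
print(y)
```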

Output:

Plotting a comparison graph of the coefficients:

```python
import matplotlib.pyplot as plt
import numpy as np

# Elastic net: plot only the nonzero coefficients
m, s, _ = plt.stem(np.where(model_enet.coef_)[0], model_enet.coef_[model_enet.coef_ != 0],
                   markerfmt='bo', label='Elastic net coefficients')
plt.setp([m, s], color="green")

# Lasso: plot only the nonzero coefficients
m, s, _ = plt.stem(np.where(model_lasso.coef_)[0], model_lasso.coef_[model_lasso.coef_ != 0],
                   markerfmt='x', label='Lasso coefficients')
plt.setp([m, s], color='red')

# Synthetic reference coefficients
plt.stem(np.where(coef)[0], coef[coef != 0], label='true coefficients', markerfmt='bx')

plt.legend()
plt.title("Lasso R^2: %.3f, Elastic Net R^2: %.3f"
          % (r2_score(y_test, pred_lasso), r2_score(y_test, pred_enet)))
plt.show()
```

Output:

Here we can see the coefficients estimated by both models and compare them. The lasso performed almost as well as the ElasticNet, but for some coefficients the elastic net estimate differs from the lasso estimate, and this is the reason behind the improved R-squared value of the elastic net model.
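The same comparison can also be made numerically by counting how many coefficients each model keeps. The sketch below is self-contained and uses synthetic data from `make_regression` as a stand-in, so the dataset, the `alpha` value and the variable names are assumptions and its numbers will not match the housing results above.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso

# Hypothetical synthetic data: 20 features, only 5 of which are informative.
X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

models = {
    "lasso": Lasso(alpha=1.0).fit(X, y),
    "elastic net": ElasticNet(alpha=1.0, l1_ratio=0.3).fit(X, y),
}

nonzero_counts = {}
for name, model in models.items():
    # A coefficient of exactly zero means the feature was dropped by the model.
    nonzero_counts[name] = int(np.count_nonzero(model.coef_))
    print(f"{name}: {nonzero_counts[name]} of {model.coef_.size} coefficients are nonzero")
```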

In this article we have seen how elastic net regression can improve the performance of regression models. We discussed the limitations of ridge and lasso regression and compared the performance scores of lasso and ElasticNet. Parameters such as alpha and l1_ratio can cause drastic changes in performance, and cross-validation methods can be used to check and tune them. I encourage you to apply those methods to the model as well to get more accurate results.
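As a starting point for that, scikit-learn provides `ElasticNetCV`, which selects `alpha` and `l1_ratio` by cross-validation. The following is a minimal sketch rather than a tuned setup: it uses synthetic data from `make_regression` as a stand-in, and the candidate `l1_ratio` values and the number of folds are illustrative assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data standing in for a real dataset.
X, y = make_regression(n_samples=300, n_features=10, n_informative=4,
                       noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Candidate mixing values are an illustrative choice;
# the grid of alpha values is generated automatically.
l1_ratios = [0.1, 0.3, 0.5, 0.7, 0.9, 1.0]
model_cv = ElasticNetCV(l1_ratio=l1_ratios, cv=5, random_state=0)
model_cv.fit(X_train, y_train)

print("selected alpha:", model_cv.alpha_)
print("selected l1_ratio:", model_cv.l1_ratio_)
print("r^2 on test data:", model_cv.score(X_test, y_test))
```

The held-out test split is kept out of the cross-validation so that the final R-squared value is measured on data the model selection never saw.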

Yugesh is a graduate in automobile engineering and worked as a data analyst intern. He completed several Data Science projects. He has a strong interest in Deep Learning and writing blogs on data science and machine learning.
