Regularization in ML explained simply | Lasso (L1) and Ridge (L2) | Foundations for ML [Lecture 27]
Author: Vizuara
Uploaded: 2025-02-11
Views: 6130
I first heard “regularization” during MIT’s graduate-level machine learning course in the fall of 2019. Later, a couple of friends mentioned it during their ML job interviews—specifically, they were asked about “Lasso and Ridge regression.” That’s when I realized that regularization is a key concept I needed to understand better.
For new topics, I usually start by Googling “Topic XYZ visually explained.” So, I typed “Regularization visualized” into Google Images and was both amazed and a bit overwhelmed by the figures I saw. Even though the math behind regularization looked straightforward (just apply a penalty term to the loss function), something didn’t add up.
As I learned more about Lasso, I became confused: Why does Lasso force some model parameters to be exactly zero, while Ridge only makes them small? I set that confusing part aside, not knowing that it would eventually unlock the full beauty of regularization for me. Today, I truly appreciate that visual intuition—even though for 2 or 3 years I paid little attention to it.
In this video, I’ll explain regularization in the simplest way possible. I cover:
• What is Regularization?
Regularization is used in machine learning to prevent overfitting and improve a model’s ability to generalize to new, unseen data. By adding a penalty to the loss function, the model is discouraged from learning overly complex patterns or noise that only fits the training data. This penalty simplifies the model by constraining its parameters, so it focuses on the most important features.
• How Does Regularization Work?
Think of an ML model that aims to minimize its loss function. Regularization modifies this loss function by adding a penalty term that discourages the model parameters from growing too large when fitting noisy data. The regularization strength, denoted by λ, controls the trade-off between the original loss and the penalty.
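The modified loss described above can be sketched in a few lines of NumPy. This is a minimal illustration with made-up synthetic data and an arbitrary candidate weight vector, not code from the video:

```python
import numpy as np

# Hypothetical synthetic data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, 0.0, -1.0])
y = X @ true_w + rng.normal(scale=0.1, size=100)

w = np.array([1.5, 0.3, -0.8])  # some candidate weights
lam = 0.5                        # regularization strength λ

mse = np.mean((X @ w - y) ** 2)        # original loss
l2_penalty = lam * np.sum(w ** 2)      # Ridge adds λ · Σ w_j²
l1_penalty = lam * np.sum(np.abs(w))   # Lasso adds λ · Σ |w_j|

ridge_loss = mse + l2_penalty
lasso_loss = mse + l1_penalty
```

Minimizing `ridge_loss` or `lasso_loss` instead of `mse` alone is what pushes the weights toward smaller values.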
• Types of Regularization: Ridge (L2) vs. Lasso (L1) Regression
Ridge Regression (L2 Regularization):
It modifies the linear regression loss function by adding an L2 penalty (the sum of squared weights). When λ is 0, Ridge is just like normal linear regression. As λ increases, the model shrinks all weights closer to 0 to help prevent overfitting.
Lasso Regression (L1 Regularization):
It uses an L1 penalty, adding the absolute values of the weights. With a small λ, Lasso behaves like linear regression. But when λ is large, Lasso forces some weights to become exactly zero—effectively performing feature selection since features with a weight of zero are not used for making predictions.
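To see the difference in practice, here is a small scikit-learn sketch on hypothetical data where only the first of five features actually matters (note that scikit-learn calls the regularization strength `alpha` rather than λ; the specific values here are arbitrary):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Hypothetical data: only the first feature influences y.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

ridge = Ridge(alpha=10.0).fit(X, y)  # alpha plays the role of λ
lasso = Lasso(alpha=0.5).fit(X, y)

print("Ridge coefficients:", np.round(ridge.coef_, 3))
print("Lasso coefficients:", np.round(lasso.coef_, 3))
```

On data like this, Ridge shrinks the four irrelevant coefficients toward zero but leaves them nonzero, while Lasso typically sets them to exactly zero, keeping only the feature that matters.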
• Why Does Lasso Set Some Weights to Zero But Not Ridge?
This was the million-dollar question that frustrated me for quite some time. Here’s the intuition:
Ridge:
Even when λ is large, Ridge regression only shrinks the parameters, making them small but not exactly zero.
Lasso:
There are many cases where, even with a moderate λ, Lasso sets some parameters to exactly zero. While I can't paste equations and images here, picture the standard graphical illustration: in two dimensions, Ridge's penalty corresponds to a circular constraint region, while Lasso's corresponds to a diamond. The elliptical contours of the loss function tend to first touch the diamond at one of its corners, and the corners lie on the axes, so one parameter lands exactly at zero. The circle has no corners, so the touching point almost never falls exactly on an axis, which is why Ridge shrinks weights without zeroing them.
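The consequence of that geometry can be checked numerically: as λ grows, Lasso zeroes out more and more coefficients, while Ridge keeps shrinking all of them without ever producing exact zeros. A quick sketch on hypothetical data (the coefficient and `alpha` values are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Hypothetical data: four features with decreasing importance.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
y = X @ np.array([4.0, 2.0, 0.5, 0.1]) + rng.normal(scale=0.2, size=150)

for alpha in [0.01, 0.1, 1.0]:
    lasso = Lasso(alpha=alpha).fit(X, y)
    ridge = Ridge(alpha=alpha * 150).fit(X, y)
    print(f"alpha={alpha}: "
          f"Lasso zero coefs={np.sum(lasso.coef_ == 0.0)}, "
          f"Ridge zero coefs={np.sum(ridge.coef_ == 0.0)}")
```

As `alpha` increases, the weakest features drop out of the Lasso model entirely; Ridge's count of exact zeros stays at zero throughout.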
• How Do You Select a Good Value for λ?
There’s no strict rule, but I’ll share some primary considerations and practical insights—especially if you’re using scikit-learn in Python.
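One common practical approach, sketched here on hypothetical data, is to let scikit-learn pick λ by cross-validation over a grid of candidates using `LassoCV` and `RidgeCV` (the grid ranges below are arbitrary starting points, not recommendations from the video):

```python
import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV

# Hypothetical data; in practice, fit on your own training set.
rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0, 0.0, 0.0, 0.5, 0.0])
X = rng.normal(size=(300, 6))
y = X @ true_w + rng.normal(scale=0.3, size=300)

# Both estimators choose λ (called alpha in scikit-learn) by
# cross-validating over the supplied grid.
lasso_cv = LassoCV(alphas=np.logspace(-3, 1, 50), cv=5).fit(X, y)
ridge_cv = RidgeCV(alphas=np.logspace(-3, 3, 50), cv=5).fit(X, y)

print("Best Lasso alpha:", lasso_cv.alpha_)
print("Best Ridge alpha:", ridge_cv.alpha_)
```

A logarithmically spaced grid is the usual choice because useful λ values often span several orders of magnitude.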
I highly recommend checking out the brilliant visuals and explanations on explained.ai (link in the video description) for even more insight into these concepts.
If you’re interested in understanding the full beauty and intuition behind regularization, Lasso, and Ridge regression, then this video is for you. Enjoy, and I’m sure you’ll appreciate these concepts as much as I do!
Don’t forget to like, subscribe, and hit the bell icon for more deep dives into machine learning concepts. Thanks for watching!