Optimizers are algorithms or methods used to change the attributes of a neural network, such as the weights and the learning rate, in order to reduce the loss.


The Main Idea of Fitting a Line to Data, a.k.a. Least Squares and Linear Regression

When we fit a line with Linear Regression we optimize the intercept and the slope, when we use Logistic Regression we optimize a squiggle, and when we use t-SNE we optimize clusters.
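To make the linear case concrete, here is a minimal sketch of fitting the intercept and slope by least squares using the closed-form formulas (the toy data is an assumption, chosen to lie exactly on y = 2x):

```python
# Toy data (assumed for illustration): points on the line y = 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares solution:
# slope = covariance(x, y) / variance(x), intercept from the means
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

print(intercept, slope)  # → 0.0 2.0
```

Because the data lies exactly on a line, the fit recovers intercept 0 and slope 2; with noisy data the same formulas give the line minimizing the sum of squared residuals.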

Gradient Descent:

Gradient Descent identifies the optimal values by taking big steps when it is far away from the minimum and baby steps when it is close.

<aside> 💡 If we were using Least Squares to solve for the optimal value of the intercept, we would simply find where the slope of the loss curve is 0. Gradient Descent instead finds the minimum by taking steps from an initial guess until it reaches the best value.

</aside>

Gradient Descent is very useful when it is not possible to solve analytically for where the derivative equals 0, which is why gradient descent can be used in so many different situations.
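The idea above can be sketched for the same line-fitting problem: start from an initial guess, repeatedly step each parameter against the gradient of the mean squared error, and watch the steps shrink as the gradient shrinks near the minimum. The data, learning rate, and iteration count are assumptions for illustration:

```python
# Toy data (assumed): points on the line y = 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

intercept, slope = 0.0, 0.0   # initial guess
lr = 0.01                     # learning rate scales each step

for step in range(5000):
    # Gradients of the mean squared error w.r.t. intercept and slope
    d_intercept = sum(2 * ((intercept + slope * x) - y)
                      for x, y in zip(xs, ys)) / len(xs)
    d_slope = sum(2 * ((intercept + slope * x) - y) * x
                  for x, y in zip(xs, ys)) / len(xs)
    # Step size is proportional to the gradient: big steps when far
    # from the minimum, baby steps as the gradient approaches zero
    intercept -= lr * d_intercept
    slope -= lr * d_slope

print(round(intercept, 3), round(slope, 3))
```

After enough iterations the parameters land close to the closed-form answer (intercept 0, slope 2), without ever solving the derivative-equals-zero equation directly.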
