l.cristian

Stochastic Gradient Descent update and the intuition behind it

Training a neural network involves some form of optimisation (minimisation) of an objective function. The root strategy from which all modern optimisers evolved is stochastic gradient descent (SGD).

The basic update rule is quite simple, but the intuition behind why the formula looks the way it does, and why it works at all, is harder to grasp.

This post is an attempt to give you one intuition on the mechanics of this function.
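To make the update rule concrete, here is a minimal sketch of SGD applied to a toy problem. The rule is θ ← θ − η·∇L(θ): at each step, nudge the parameter against the gradient of the loss on a single sample. The data, loss, and learning rate below are illustrative assumptions, not from the article.

```python
import random

def grad(theta, x, y):
    # Gradient of the squared error (theta*x - y)^2 w.r.t. theta.
    return 2 * x * (theta * x - y)

# Toy data generated from y = 3x; SGD should recover theta close to 3.
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]

theta = 0.0   # initial parameter guess
lr = 0.05     # learning rate (eta), a hand-picked value for this example

random.seed(0)
for epoch in range(200):
    random.shuffle(data)  # "stochastic": visit samples in random order
    for x, y in data:
        # The SGD update: step against the gradient of the per-sample loss.
        theta -= lr * grad(theta, x, y)

print(round(theta, 3))  # converges to roughly 3.0
```

Each update uses the gradient from one sample rather than the full dataset, which is what makes the method "stochastic": the steps are noisy individually but pull toward the minimum on average.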


