Gradient descent
'''Gradient descent''' ('''GD''') is a general [[optimization algorithm]] that can be used to find a (possibly local) minimum of a differentiable function. '''Stochastic gradient descent''' ('''SGD''') performs an update for each single data point (or batch), whereas full GD computes the complete gradient over all data before performing an update. In [[Recommender System|recommender systems]], methods based on gradient descent are popular for fitting the parameters of a prediction model, e.g. [[matrix factorization]] models.
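The SGD-for-matrix-factorization idea described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name, hyperparameters (number of factors <code>k</code>, learning rate, regularization constant, epoch count), and the toy rating data are all assumptions chosen for the example. Each observed rating triggers one gradient step on the corresponding user and item factor vectors, which is exactly the "update per single data point" behavior that distinguishes SGD from full GD.

```python
# Hypothetical sketch of SGD for a basic matrix factorization model.
# All names and hyperparameters are illustrative assumptions.
import random

def sgd_matrix_factorization(ratings, n_users, n_items, k=2,
                             lr=0.02, reg=0.02, epochs=1000, seed=0):
    """Fit factor matrices P (n_users x k) and Q (n_items x k)
    so that the dot product P[u] . Q[i] approximates rating r."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    data = list(ratings)  # copy so the caller's list is not reordered
    for _ in range(epochs):
        rng.shuffle(data)  # visit data points in random order
        for u, i, r in data:  # one SGD update per observed rating
            pred = sum(P[u][f] * Q[i][f] for f in range(k))
            err = r - pred
            for f in range(k):  # gradient step with L2 regularization
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# Toy data: (user_index, item_index, rating) triples.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0)]
P, Q = sgd_matrix_factorization(ratings, n_users=2, n_items=3)
```

A full-GD variant would instead accumulate the gradient over all four ratings and apply a single update per epoch; SGD's per-point updates are usually preferred for large rating datasets because each step is cheap and the noise helps escape shallow local minima.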
== External links ==
* [[Wikipedia: Gradient descent]]

[[Category:Method]]
Latest revision as of 13:18, 6 June 2011