Stochastic Gradient Descent Tricks
Authors: Léon Bottou
Published: 2012 (Book Chapter)
Source: Lecture Notes in Computer Science
Algorithm: SGD
DOI: 10.1007/978-3-642-35289-8_25
Summary
Abstract
Chapter 1 strongly advocates the stochastic back-propagation method to train neural networks. This is in fact an instance of a more general technique called stochastic gradient descent (SGD). This chapter provides background material, explains why SGD is a good learning algorithm when the training set is large, and provides useful recommendations.
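As a minimal illustration of the technique the abstract describes (not the paper's own code), a plain SGD loop for least-squares linear regression might look like the sketch below; the function name, learning rate, and epoch count are assumptions for the example.

```python
import random

def sgd_linear_regression(data, lr=0.05, epochs=100, seed=0):
    """Plain SGD for least-squares regression y ~ w*x + b.

    Illustrative sketch only; hyperparameters are assumptions,
    not recommendations taken from the paper.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)              # visit examples in random order
        for x, y in data:
            err = (w * x + b) - y      # prediction error on one example
            w -= lr * err * x          # gradient of 0.5*err^2 w.r.t. w
            b -= lr * err              # gradient of 0.5*err^2 w.r.t. b
    return w, b

# Recover y = 2x + 1 from a handful of noiseless examples.
points = [(k * 0.1, 2 * (k * 0.1) + 1) for k in range(20)]
w, b = sgd_linear_regression(points)
```

The key point the abstract makes is visible in the loop structure: each update touches a single example, so the per-step cost is independent of the training-set size, which is why SGD scales well when the training set is large.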