Stochastic Gradient Descent
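Models that optimize over a manifold are trained with stochastic gradient descent (SGD), which estimates the gradient of the objective from randomly sampled training examples rather than from the full dataset.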
The SGD method implemented by H2O is Hogwild!, a lock-free approach to parallelizing SGD (see References).
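To make the lock-free idea concrete, here is a minimal sketch of a Hogwild!-style update loop for a least-squares objective. It is an illustration only and is not H2O's code: the function name `hogwild_sgd`, its parameters, the learning rate, and the choice of objective are all assumptions made for this example. Several workers read and write one shared weight vector without any locking, and each worker updates only the coordinates touched by its sampled example. Python threads are used purely to show the structure, so no real speedup should be expected from this sketch.

```python
import threading

import numpy as np


def hogwild_sgd(X, y, n_workers=4, n_steps=2000, lr=0.05, seed=0):
    """Illustrative lock-free SGD for least squares (hypothetical helper)."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)  # shared weights; no lock protects them

    def worker(worker_id):
        # Each worker gets its own random stream so samples differ.
        rng = np.random.default_rng(seed + worker_id)
        for _ in range(n_steps):
            i = rng.integers(n_samples)   # pick one training example
            residual = X[i] @ w - y[i]    # unsynchronized read of w
            grad = residual * X[i]        # gradient of 0.5 * residual**2
            # Write back only the coordinates this example touches.
            for j in np.flatnonzero(grad):
                w[j] -= lr * grad[j]

    threads = [
        threading.Thread(target=worker, args=(k,)) for k in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w


# Usage on synthetic sparse data (assumed for the example).
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 10)) * (rng.random((200, 10)) < 0.3)
w_true = rng.normal(size=10)
y_demo = X_demo @ w_true
w_hat = hogwild_sgd(X_demo, y_demo)
print(np.max(np.abs(w_hat - w_true)))  # small if the run converged
```

Because workers may overwrite each other's updates, individual steps can be lost or computed from stale weights. The Hogwild! paper argues that this is tolerable when each update touches only a small part of the weight vector, which is why the demo data is sparse.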
References
Niu, Feng, et al. “Hogwild!: A lock-free approach to parallelizing stochastic gradient descent.” Advances in Neural Information Processing Systems 24 (2011): 693-701. The implemented algorithm appears on p. 5 of http://www.eecs.berkeley.edu/~brecht/papers/hogwildTR.pdf