Stochastic Gradient Descent

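Models that optimize over a manifold are trained with stochastic gradient descent (SGD), which updates the model parameters using a gradient estimated from a randomly chosen training example rather than from the full data set.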

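The SGD method implemented by H2O is Hogwild!, a lock-free approach to parallelizing SGD (Niu et al., 2011):

[Figure: the Hogwild! algorithm (../_images/hogwild.png)]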
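
As a rough illustration of the lock-free idea in the Hogwild! paper cited under References, the Python sketch below runs several worker threads that update one shared weight vector without any locking. It is not H2O's implementation. The toy sparse least-squares problem, the function names (make_sparse_problem, hogwild_sgd), and all parameter values are assumptions chosen for the example.

```python
import threading

import numpy as np


def make_sparse_problem(n_examples=500, n_features=20, nnz=3, seed=0):
    """Build a noise-free sparse least-squares problem (hypothetical toy data)."""
    rng = np.random.default_rng(seed)
    true_w = rng.normal(size=n_features)
    # Each example touches only `nnz` distinct coordinates of the weight vector.
    idx = [rng.choice(n_features, size=nnz, replace=False) for _ in range(n_examples)]
    val = [rng.normal(size=nnz) for _ in range(n_examples)]
    y = np.array([v @ true_w[i] for i, v in zip(idx, val)])
    return idx, val, y, true_w


def hogwild_sgd(idx, val, y, n_features, n_workers=4, steps_per_worker=5000, lr=0.05):
    """Hogwild-style SGD sketch: workers share `w` and never take a lock."""
    w = np.zeros(n_features)  # shared by all workers, never locked

    def worker(worker_seed):
        rng = np.random.default_rng(worker_seed)  # one generator per thread
        for _ in range(steps_per_worker):
            i = rng.integers(len(y))          # sample one example at random
            cols, x = idx[i], val[i]
            err = w[cols] @ x - y[i]          # read (possibly stale) shared weights
            w[cols] -= lr * err * x           # update only the touched coordinates

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w


if __name__ == "__main__":
    idx, val, y, true_w = make_sparse_problem()
    w = hogwild_sgd(idx, val, y, n_features=len(true_w))
    print("max abs error:", np.max(np.abs(w - true_w)))
```

Because each example touches only a few coordinates, concurrent updates rarely collide, which is the sparsity condition the paper relies on. In standard CPython the threads do not run in true parallel, so this sketch shows the update pattern rather than any speedup.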

References

Niu, Feng, et al. “Hogwild!: A lock-free approach to parallelizing stochastic gradient descent.” Advances in Neural Information Processing Systems 24 (2011): 693-701. (Implemented algorithm on p. 5: http://www.eecs.berkeley.edu/~brecht/papers/hogwildTR.pdf)