https://en.wikipedia.org/wiki/Stochastic_gradient_descent#AdaGrad