Abstract
We develop a novel framework to study smooth and strongly convex optimization algorithms. Focusing on quadratic functions, we are able to examine optimization algorithms as a recursive application of linear operators. This, in turn, reveals a powerful connection between a class of optimization algorithms and the analytic theory of polynomials, whereby new lower and upper bounds are derived. Whereas existing lower bounds for this setting are only valid when the dimensionality scales with the number of iterations, our lower bound holds in the natural regime where the dimensionality is fixed. Lastly, we present a novel systematic derivation of Nesterov's well-known Accelerated Gradient Descent (AGD) method, expressing it as an optimal solution of the corresponding optimization problem over polynomials, as formulated by our framework. This rather natural interpretation of AGD contrasts with earlier ones, which lacked a simple, yet solid, motivation.
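The linear-operator view mentioned in the abstract can be illustrated with a minimal sketch that is not taken from the paper. It assumes the quadratic f(x) = ½ xᵀAx − bᵀx with a symmetric positive definite A, plain gradient descent with step size η = 2/(μ + L), and a hand-picked 2×2 example; all function and variable names are hypothetical. Under these assumptions the error x_k − x* equals (I − ηA)^k (x_0 − x*), that is, a degree-k polynomial in A applied to the initial error.

```python
# Illustrative sketch (assumptions: f(x) = 0.5 x^T A x - b^T x, A symmetric
# positive definite, plain gradient descent). Names here are hypothetical.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def gradient_descent(A, b, x0, eta, steps):
    """Run x <- x - eta * (A x - b) for a fixed number of steps."""
    x = list(x0)
    for _ in range(steps):
        grad = [g_i - b_i for g_i, b_i in zip(matvec(A, x), b)]
        x = [x_i - eta * g_i for x_i, g_i in zip(x, grad)]
    return x

def apply_operator_power(A, eta, e0, steps):
    """Apply the linear operator (I - eta * A) to e0, `steps` times."""
    n = len(A)
    M = [[(1.0 if i == j else 0.0) - eta * A[i][j] for j in range(n)]
         for i in range(n)]
    e = list(e0)
    for _ in range(steps):
        e = matvec(M, e)
    return e

# Hand-picked example: eigenvalues of A are mu = 1 and L = 3.
A = [[2.0, 1.0], [1.0, 2.0]]
b = [1.0, 0.0]
x_star = [2.0 / 3.0, -1.0 / 3.0]   # solves A x = b
mu, L = 1.0, 3.0
eta = 2.0 / (mu + L)               # step size assumed for this sketch
rho = (L - mu) / (L + mu)          # spectral radius of I - eta * A

x0 = [1.0, 1.0]
steps = 10
e0 = [x_i - s_i for x_i, s_i in zip(x0, x_star)]

# Error obtained by actually running the algorithm.
x_k = gradient_descent(A, b, x0, eta, steps)
err_iter = [x_i - s_i for x_i, s_i in zip(x_k, x_star)]

# Error predicted by the operator (polynomial) form.
err_op = apply_operator_power(A, eta, e0, steps)
```

Both `err_iter` and `err_op` should agree up to floating-point rounding, and the squared error norm should shrink by at most a factor of `rho ** 2` per step. Choosing better polynomials than (1 − ηλ)^k is, loosely speaking, where the connection to accelerated methods enters.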
Original language | English |
---|---|
Pages (from-to) | 1-51 |
Number of pages | 51 |
Journal | Journal of Machine Learning Research |
Volume | 17 |
State | Published - 1 Feb 2016 |
Bibliographical note
Publisher Copyright: © 2016 Yossi Arjevani, Shai Shalev-Shwartz and Ohad Shamir.
Keywords
- Accelerated gradient descent
- Full gradient descent
- Heavy ball method
- Smooth and strongly convex optimization