Oracle complexity of second-order methods for finite-sum problems

Yossi Arjevani, Ohad Shamir

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Finite-sum optimization problems are ubiquitous in machine learning, and are commonly solved using first-order methods which rely on gradient computations. Recently, there has been growing interest in second-order methods, which rely on both gradients and Hessians. In principle, second-order methods can require much fewer iterations than first-order methods, and hold the promise for more efficient algorithms. Although computing and manipulating Hessians is prohibitive for high-dimensional problems in general, the Hessians of individual functions in finite-sum problems can often be efficiently computed, e.g. because they possess a low-rank structure. Can second-order information indeed be used to solve such problems more efficiently? In this paper, we provide evidence that the answer - perhaps surprisingly - is negative, at least in terms of worst-case guarantees. We also discuss what additional assumptions and algorithmic approaches might potentially circumvent this negative result.

Original languageAmerican English
Title of host publication34th International Conference on Machine Learning, ICML 2017
PublisherInternational Machine Learning Society (IMLS)
Pages274-297
Number of pages24
ISBN (Electronic)9781510855144
StatePublished - 2017
Externally publishedYes
Event34th International Conference on Machine Learning, ICML 2017 - Sydney, Australia
Duration: 6 Aug 201711 Aug 2017

Publication series

Name34th International Conference on Machine Learning, ICML 2017
Volume1

Conference

Conference34th International Conference on Machine Learning, ICML 2017
Country/TerritoryAustralia
CitySydney
Period6/08/1711/08/17

Bibliographical note

Publisher Copyright:
© 2017 International Machine Learning Society (IMLS). All rights reserved.

Fingerprint

Dive into the research topics of 'Oracle complexity of second-order methods for finite-sum problems'. Together they form a unique fingerprint.

Cite this