Minimizing the maximal loss: How and why

Shai Shalev-Shwartz, Yonatan Wexler

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

21 Scopus citations

Abstract

A commonly used learning rule is to approximately minimize the average loss over the training set. Other learning algorithms, such as AdaBoost and hard-SVM, aim at minimizing the maximal loss over the training set. The average loss is more popular, particularly in deep learning, for three main reasons. First, it can be conveniently minimized using online algorithms that process a few examples at each iteration. Second, it is often argued that there is little point in minimizing the loss on the training set too aggressively, as the gain will not be reflected in the generalization loss. Third, the maximal loss is not robust to outliers. In this paper we describe and analyze an algorithm that can convert any online algorithm into a minimizer of the maximal loss. We prove that in some situations better accuracy on the training set is crucial for obtaining good performance on unseen examples. Finally, we propose robust versions of the approach that can handle outliers.
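The contrast the abstract draws is between the average objective (1/m) Σ_i ℓ(w; x_i, y_i) and the maximal objective max_i ℓ(w; x_i, y_i). The sketch below is only an illustration of that contrast, not the algorithm analyzed in the paper: it compares a plain gradient step on the average loss with a naive step on the single hardest example, using a hypothetical linear model with squared loss and made-up function and variable names.

```python
# Illustrative sketch only (not the paper's algorithm): average-loss step vs.
# a crude max-loss step that updates on the single hardest training example.
# The linear/squared-loss setup and all names here are hypothetical.
import numpy as np

def squared_losses(w, X, y):
    """Per-example squared losses of a linear model w on data (X, y)."""
    return 0.5 * (X @ w - y) ** 2

def average_loss_step(w, X, y, lr=0.1):
    """Gradient step on the average loss over the whole training set."""
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def max_loss_step(w, X, y, lr=0.1):
    """Gradient step on the example with the largest current loss,
    a simple proxy for minimizing the maximal loss."""
    i = np.argmax(squared_losses(w, X, y))
    grad = (X[i] @ w - y[i]) * X[i]
    return w - lr * grad

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    y = X @ rng.normal(size=5)          # realizable data, so max loss can reach 0
    w_avg = np.zeros(5)
    w_max = np.zeros(5)
    for _ in range(200):
        w_avg = average_loss_step(w_avg, X, y)
        w_max = max_loss_step(w_max, X, y)
    print("max loss after average-loss training:", squared_losses(w_avg, X, y).max())
    print("max loss after max-loss training:    ", squared_losses(w_max, X, y).max())
```

Note that stepping only on the current worst example is the most naive way to target the maximal loss; the paper instead wraps an arbitrary online algorithm so that the combined procedure provably minimizes the maximal loss, and adds robust variants to cope with outliers.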

Original language: English
Title of host publication: 33rd International Conference on Machine Learning, ICML 2016
Editors: Maria Florina Balcan, Kilian Q. Weinberger
Publisher: International Machine Learning Society (IMLS)
Pages: 1253-1265
Number of pages: 13
ISBN (Electronic): 9781510829008
State: Published - 2016
Event: 33rd International Conference on Machine Learning, ICML 2016 - New York City, United States
Duration: 19 Jun 2016 - 24 Jun 2016

Publication series

Name: 33rd International Conference on Machine Learning, ICML 2016
Volume: 2

Conference

Conference: 33rd International Conference on Machine Learning, ICML 2016
Country/Territory: United States
City: New York City
Period: 19/06/16 - 24/06/16
