Learning linear and kernel predictors with the 0-1 loss function

Shai Shalev-Shwartz, Ohad Shamir, Karthik Sridharan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Some of the most successful machine learning algorithms, such as Support Vector Machines, are based on learning linear and kernel predictors with respect to a convex loss function, such as the hinge loss. For classification purposes, a more natural loss function is the 0-1 loss. However, using it leads to a non-convex optimization problem for which no efficient algorithm is known. In this paper, we describe and analyze a new algorithm for learning linear or kernel predictors with respect to the 0-1 loss function. The algorithm is parameterized by L, which quantifies the effective width around the decision boundary in which the predictor may be uncertain. We show that without any distributional assumptions, and for any fixed L, the algorithm runs in polynomial time, and learns a classifier whose 0-1 error exceeds that of the optimal such classifier by at most ε. We also prove a hardness result, showing that under a certain cryptographic assumption, no algorithm can learn such classifiers in time polynomial in L.
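To make the contrast in the abstract concrete, the following is a minimal NumPy sketch, not the paper's algorithm: it evaluates a linear predictor under the convex hinge loss, the non-convex 0-1 loss, and a sigmoid smoothing of the 0-1 loss whose steepness parameter is loosely analogous to the width parameter L described above. The toy data, the sigmoid transfer function, and all function names are assumptions made for this illustration.

```python
import numpy as np

def hinge_loss(w, X, y):
    """Convex surrogate used by SVMs: mean of max(0, 1 - y * <w, x>)."""
    margins = y * (X @ w)
    return np.mean(np.maximum(0.0, 1.0 - margins))

def zero_one_loss(w, X, y):
    """Non-convex 0-1 loss: fraction of sign disagreements."""
    return np.mean(np.sign(X @ w) != y)

def sigmoid_loss(w, X, y, L=10.0):
    """An L-Lipschitz smoothing of the 0-1 loss (illustrative only):
    it approaches the 0-1 loss as L grows, and is 'uncertain' in a
    band of width roughly 1/L around the decision boundary."""
    margins = y * (X @ w)
    return np.mean(1.0 / (1.0 + np.exp(L * margins)))

# Toy data: two Gaussian blobs with labels in {-1, +1}.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(+1.0, 1.0, (50, 2)),
               rng.normal(-1.0, 1.0, (50, 2))])
y = np.concatenate([np.ones(50), -np.ones(50)])
w = np.array([1.0, 1.0])

print("hinge:   ", hinge_loss(w, X, y))
print("0-1:     ", zero_one_loss(w, X, y))
print("sigmoid: ", sigmoid_loss(w, X, y, L=10.0))
```

As the steepness parameter grows, the sigmoid loss converges pointwise to the 0-1 loss, which matches the abstract's trade-off: runtime can be polynomial for any fixed L, while the hardness result rules out (under a cryptographic assumption) runtime polynomial in L itself.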

Original language: English
Title of host publication: IJCAI 2011 - 22nd International Joint Conference on Artificial Intelligence
Pages: 2740-2745
Number of pages: 6
DOIs
State: Published - 2011
Event: 22nd International Joint Conference on Artificial Intelligence, IJCAI 2011 - Barcelona, Catalonia, Spain
Duration: 16 Jul 2011 - 22 Jul 2011

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
ISSN (Print): 1045-0823

Conference

Conference: 22nd International Joint Conference on Artificial Intelligence, IJCAI 2011
Country/Territory: Spain
City: Barcelona, Catalonia
Period: 16/07/11 - 22/07/11
