Learning kernel-based halfspaces with the 0-1 loss

Shai Shalev-Shwartz*, Ohad Shamir, Karthik Sridharan

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review



We describe and analyze a new algorithm for agnostically learning kernel-based halfspaces with respect to the 0-1 loss function. Unlike most of the previous formulations, which rely on surrogate convex loss functions (e.g., hinge-loss in support vector machines (SVMs) and log-loss in logistic regression), we provide finite time/sample guarantees with respect to the more natural 0-1 loss function. The proposed algorithm can learn kernel-based halfspaces in worst-case time poly(exp(Llog(L/ε))), for any distribution, where L is a Lipschitz constant (which can be thought of as the reciprocal of the margin), and the learned classifier is worse than the optimal halfspace by at most ε. We also prove a hardness result, showing that under a certain cryptographic assumption, no algorithm can learn kernel-based halfspaces in time polynomial in L.
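The abstract contrasts the 0-1 loss with convex surrogates such as the hinge loss. A minimal illustrative sketch (not the paper's algorithm) of this distinction for a linear classifier: the 0-1 loss is invariant to rescaling the weight vector, while the hinge loss penalizes small margins, which is why margin (here, the reciprocal of the Lipschitz constant L) enters surrogate-based guarantees.

```python
import numpy as np

# Illustrative sketch (not the paper's algorithm): comparing the 0-1 loss
# with the convex hinge-loss surrogate used by SVMs, for a linear
# classifier w on labeled data (x_i, y_i) with y_i in {-1, +1}.

def zero_one_loss(w, X, y):
    """Fraction of examples misclassified by sign(<w, x>)."""
    margins = y * (X @ w)
    return np.mean(margins <= 0)

def hinge_loss(w, X, y):
    """Average hinge loss max(0, 1 - y<w, x>), a convex upper bound on 0-1."""
    margins = y * (X @ w)
    return np.mean(np.maximum(0.0, 1.0 - margins))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = np.ones(5)
y = np.sign(X @ w_true)          # labels realizable by w_true

w = 0.1 * w_true                 # rescaled classifier: same sign predictions
print(zero_one_loss(w, X, y))   # 0.0 -- rescaling w leaves the 0-1 loss unchanged
print(hinge_loss(w, X, y))      # positive -- hinge loss penalizes the shrunken margins
```

The scale-sensitivity of the surrogate is exactly what the 0-1 guarantee in the paper avoids, at the cost of the exp(L log(L/ε)) dependence.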

Original language: American English
Pages (from-to): 1623-1646
Number of pages: 24
Journal: SIAM Journal on Computing
Issue number: 6
State: Published - 2011


Keywords:

  • Kernel methods
  • Learning halfspaces
  • Learning theory


