## Abstract

We describe and analyze a new algorithm for agnostically learning kernel-based halfspaces with respect to the 0-1 loss function. Unlike most previous formulations, which rely on surrogate convex loss functions (e.g., the hinge loss in support vector machines (SVMs) and the log loss in logistic regression), we provide finite time/sample guarantees with respect to the more natural 0-1 loss function. The proposed algorithm can learn kernel-based halfspaces in worst-case time poly(exp(L log(L/ε))), for any distribution, where L is a Lipschitz constant (which can be thought of as the reciprocal of the margin), and the learned classifier is worse than the optimal halfspace by at most ε. We also prove a hardness result, showing that under a certain cryptographic assumption, no algorithm can learn kernel-based halfspaces in time polynomial in L.
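To make the distinction between the 0-1 loss and a convex surrogate concrete, here is a minimal, hypothetical sketch (not from the paper) comparing the two losses for a linear halfspace h(x) = sign(⟨w, x⟩) on toy data; the data, weight vector, and function names are illustrative assumptions:

```python
import numpy as np

def zero_one_loss(w, X, y):
    # Fraction of examples misclassified by the halfspace sign(<w, x>);
    # this is the loss the paper's guarantees are stated in.
    preds = np.sign(X @ w)
    return float(np.mean(preds != y))

def hinge_loss(w, X, y):
    # Convex surrogate used by SVMs: mean of max(0, 1 - y<w, x>).
    margins = y * (X @ w)
    return float(np.mean(np.maximum(0.0, 1.0 - margins)))

# Toy data: four points in the plane with labels in {-1, +1} (illustrative).
X = np.array([[2.0, 0.0], [0.5, 1.0], [-1.0, 0.5], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = np.array([1.0, 0.0])  # halfspace separating on the first coordinate

print(zero_one_loss(w, X, y))  # 0.0   -- every point correctly classified
print(hinge_loss(w, X, y))     # 0.125 -- surrogate still penalizes small margins
```

The example shows why the two objectives can disagree: the 0-1 loss is zero here, yet the hinge loss is positive because one point has margin below 1, which is why guarantees stated for a surrogate loss do not directly transfer to the 0-1 loss.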

| Original language | American English |
|---|---|
| Pages (from-to) | 1623-1646 |
| Number of pages | 24 |
| Journal | SIAM Journal on Computing |
| Volume | 40 |
| Issue number | 6 |
| DOIs | |
| State | Published - 2011 |

## Keywords

- Kernel methods
- Learning halfspaces
- Learning theory