TY - GEN
T1 - Learning linear and kernel predictors with the 0-1 loss function
AU - Shalev-Shwartz, Shai
AU - Shamir, Ohad
AU - Sridharan, Karthik
PY - 2011
Y1 - 2011
N2 - Some of the most successful machine learning algorithms, such as Support Vector Machines, are based on learning linear and kernel predictors with respect to a convex loss function, such as the hinge loss. For classification purposes, a more natural loss function is the 0-1 loss. However, using it leads to a non-convex problem for which there is no known efficient algorithm. In this paper, we describe and analyze a new algorithm for learning linear or kernel predictors with respect to the 0-1 loss function. The algorithm is parameterized by L, which quantifies the effective width around the decision boundary in which the predictor may be uncertain. We show that without any distributional assumptions, and for any fixed L, the algorithm runs in polynomial time, and learns a classifier which is worse than the optimal such classifier by at most ε. We also prove a hardness result, showing that under a certain cryptographic assumption, no algorithm can learn such classifiers in time polynomial in L.
AB - Some of the most successful machine learning algorithms, such as Support Vector Machines, are based on learning linear and kernel predictors with respect to a convex loss function, such as the hinge loss. For classification purposes, a more natural loss function is the 0-1 loss. However, using it leads to a non-convex problem for which there is no known efficient algorithm. In this paper, we describe and analyze a new algorithm for learning linear or kernel predictors with respect to the 0-1 loss function. The algorithm is parameterized by L, which quantifies the effective width around the decision boundary in which the predictor may be uncertain. We show that without any distributional assumptions, and for any fixed L, the algorithm runs in polynomial time, and learns a classifier which is worse than the optimal such classifier by at most ε. We also prove a hardness result, showing that under a certain cryptographic assumption, no algorithm can learn such classifiers in time polynomial in L.
UR - http://www.scopus.com/inward/record.url?scp=84881060988&partnerID=8YFLogxK
U2 - 10.5591/978-1-57735-516-8/IJCAI11-456
DO - 10.5591/978-1-57735-516-8/IJCAI11-456
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:84881060988
SN - 9781577355120
T3 - IJCAI International Joint Conference on Artificial Intelligence
SP - 2740
EP - 2745
BT - IJCAI 2011 - 22nd International Joint Conference on Artificial Intelligence
T2 - 22nd International Joint Conference on Artificial Intelligence, IJCAI 2011
Y2 - 16 July 2011 through 22 July 2011
ER -