## Abstract

The difficulty of multi-class classification generally increases with the number of classes. Using data for a small set of the classes, can we predict how well the classifier scales as the number of classes increases? We propose a framework for studying this question, assuming that classes in both sets are sampled from the same population and that the classifier is based on independently learned scoring functions. Under this framework, we can express the classification accuracy on a set of k classes as the (k−1)st moment of a discriminability function; the discriminability function itself does not depend on k. We leverage this result to develop a non-parametric regression estimator for the discriminability function, which can extrapolate accuracy results to larger unobserved sets. We also formalize an alternative approach that extrapolates accuracy separately for each class, and identify tradeoffs between the two methods. We show that both methods can accurately predict classifier performance on label sets up to ten times the size of the original set, both in simulations and in realistic face recognition or character recognition tasks.
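The moment identity stated above can be illustrated with a minimal sketch. This is a toy model built on assumptions that are not taken from the paper: each test example is summarized by a value `u`, read here as the probability that a randomly drawn incorrect class scores below the correct class, and incorrect-class scores are independent Uniform(0, 1) draws. The names `moment_accuracy` and `simulated_accuracy` are hypothetical, and the code does not implement the paper's regression estimator.

```python
import random


def moment_accuracy(u_values, k):
    """Mean of u**(k-1): the (k-1)st empirical moment of the u values."""
    return sum(u ** (k - 1) for u in u_values) / len(u_values)


def simulated_accuracy(u_values, k, rng):
    """Simulate k-class classification directly under the toy model."""
    correct = 0
    for u in u_values:
        # Toy assumption: the true class scores u, and each of the k-1
        # incorrect classes scores an independent Uniform(0, 1) draw.
        # The example is classified correctly if every incorrect score is below u.
        if all(rng.random() < u for _ in range(k - 1)):
            correct += 1
    return correct / len(u_values)


rng = random.Random(0)
# Hypothetical distribution of u: the max of two uniforms (density 2u),
# for which the (k-1)st moment has the closed form 2 / (k + 1).
u_values = [max(rng.random(), rng.random()) for _ in range(20000)]

for k in (2, 5, 10, 20):
    print(
        k,
        round(moment_accuracy(u_values, k), 3),          # moment formula
        round(simulated_accuracy(u_values, k, rng), 3),  # direct simulation
        round(2 / (k + 1), 3),                           # closed form for this toy model
    )
```

In this toy setting the distribution of `u` is fixed once and reused for every `k`, which mirrors the abstract's point that the discriminability function does not depend on k; only the exponent changes as the label set grows.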

Original language | English |
---|---|
Article number | 65 |
Number of pages | 30 |
Journal | Journal of Machine Learning Research |
Volume | 19 |
State | Published - 1 Oct 2018 |

### Bibliographical note

Funding Information: We thank Jonathan Taylor, Trevor Hastie, John Duchi, Steve Mussmann, Qingyun Sun, Robert Tibshirani, Patrick McClure, Francisco Pereira, and Gal Elidan for useful discussion. CZ is supported by an NSF graduate research fellowship, and would also like to thank the European Research Council under the ERC grant agreement n° [PSARPS-294519] for travel support. We would also like to thank the anonymous reviewers for their comments, which improved the readability of the paper.

Publisher Copyright:

© 2018 Charles Zheng, Rakesh Achanta, and Yuval Benjamini.

## Keywords

- Face recognition
- Multi-class problems
- Nonparametric models
- Object recognition
- Transfer learning