TY - CONF
T1 - Semi-supervised learning in gigantic image collections
AU - Fergus, Rob
AU - Weiss, Yair
AU - Torralba, Antonio
PY - 2009
Y1 - 2009
N2 - With the advent of the Internet it is now possible to collect hundreds of millions of images. These images come with varying degrees of label information. "Clean labels" can be manually obtained on a small fraction, "noisy labels" may be extracted automatically from surrounding text, while for most images there are no labels at all. Semi-supervised learning is a principled framework for combining these different label sources. However, it scales polynomially with the number of images, making it impractical for use on gigantic collections with hundreds of millions of images and thousands of classes. In this paper we show how to utilize recent results in machine learning to obtain highly efficient approximations for semi-supervised learning that are linear in the number of images. Specifically, we use the convergence of the eigenvectors of the normalized graph Laplacian to eigenfunctions of weighted Laplace-Beltrami operators. Our algorithm enables us to apply semi-supervised learning to a database of 80 million images gathered from the Internet.
UR - http://www.scopus.com/inward/record.url?scp=77955655063&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:77955655063
SN - 9781615679119
T3 - Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference
SP - 522
EP - 530
BT - Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference
PB - Neural Information Processing Systems
T2 - 23rd Annual Conference on Neural Information Processing Systems, NIPS 2009
Y2 - 7 December 2009 through 10 December 2009
ER -