Abstract
Clustering unlabeled raw images is a daunting task that deep learning methods have recently approached with some success. Here we propose an unsupervised clustering framework that trains a deep neural network end to end and provides direct cluster assignments of images without additional processing. Our method, Multi-Modal Deep Clustering (MMDC), trains a deep network to align its image embeddings with target points sampled from a Gaussian Mixture Model distribution. The cluster assignments are then determined by the mixture component with which each image embedding is associated. Simultaneously, the same deep network is trained to solve an additional self-supervised task of predicting image rotations. This pushes the network to learn more meaningful image representations that facilitate better clustering. Experimental results show that MMDC achieves or exceeds state-of-the-art performance on six challenging benchmarks. On natural image datasets we improve on previous results by significant margins of up to 20 percentage points in absolute accuracy, yielding an accuracy of 82% on CIFAR-10, 45% on CIFAR-100 and 69% on STL-10.
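The two ingredients named in the abstract, Gaussian-mixture targets with component-based cluster assignment and the rotation-prediction task, can be illustrated with a minimal sketch. This is not the authors' implementation: the deep network and its training loop are omitted, and the function names, the equal-weight isotropic mixture, and the nearest-mean assignment rule are all assumptions made for illustration.

```python
import random


def sample_gmm_targets(means, sigma, n, rng):
    """Draw n target points from an equal-weight isotropic Gaussian mixture.

    Assumption: equal component weights and a shared standard deviation;
    the abstract does not specify the mixture parameters.
    """
    targets, components = [], []
    for _ in range(n):
        k = rng.randrange(len(means))  # pick a mixture component uniformly
        targets.append([rng.gauss(m, sigma) for m in means[k]])
        components.append(k)
    return targets, components


def assign_cluster(embedding, means):
    """Assign an embedding to the component with the nearest mean.

    Under the equal-weight, shared-isotropic-covariance assumption above,
    the nearest mean is also the component of highest posterior probability.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(means)), key=lambda k: sq_dist(embedding, means[k]))


def rotate90(image):
    """Rotate a 2D list (rows of pixels) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotation_task(image):
    """Return four (rotated image, label) pairs for 0, 90, 180, 270 degrees."""
    samples, current = [], image
    for label in range(4):
        samples.append((current, label))
        current = rotate90(current)
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    means = [[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]]  # hypothetical component means
    targets, components = sample_gmm_targets(means, sigma=0.1, n=5, rng=rng)
    print([assign_cluster(t, means) for t in targets] == components)
    print([label for _, label in rotation_task([[1, 2], [3, 4]])])
```

In a full pipeline, the embeddings passed to `assign_cluster` would come from the trained network, and the rotation labels would supervise an auxiliary classification head on that same network.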
| Original language | English |
|---|---|
| Title of host publication | Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 4728-4735 |
| Number of pages | 8 |
| ISBN (Electronic) | 9781728188089 |
| DOIs | |
| State | Published - 2020 |
| Event | 25th International Conference on Pattern Recognition, ICPR 2020 - Virtual, Online, Italy. Duration: 10 Jan 2021 → 15 Jan 2021 |
Publication series
| Name | Proceedings - International Conference on Pattern Recognition |
|---|---|
| ISSN (Print) | 1051-4651 |
Conference
| Conference | 25th International Conference on Pattern Recognition, ICPR 2020 |
|---|---|
| Country/Territory | Italy |
| City | Virtual, Online |
| Period | 10/01/21 → 15/01/21 |
Bibliographical note
Publisher Copyright: © 2020 IEEE
Title
Multi-modal deep clustering: Unsupervised partitioning of images