Abstract
Multidimensional Scaling (MDS) is a classical technique for embedding data in low dimensions, still in widespread use today. In this paper we study MDS in a modern setting: specifically, high dimensions and ambient measurement noise. We show that as the ambient noise level increases, MDS suffers a sharp breakdown that depends on the data dimension and noise level, and derive an explicit formula for this breakdown point in the case of white noise. We then introduce MDS+, a simple variant of MDS, which applies a shrinkage nonlinearity to the eigenvalues of the MDS similarity matrix. Under a natural loss function measuring the embedding quality, we prove that MDS+ is the unique, asymptotically optimal shrinkage function. MDS+ offers improved embedding, sometimes significantly so, compared with MDS. Importantly, MDS+ calculates the optimal dimension into which the data should be embedded.
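As a rough illustration of the pipeline the abstract describes, the sketch below runs classical MDS on a distance matrix and passes the eigenvalues of the similarity matrix through a pluggable shrinkage function. The names `mds_embed` and `hard_threshold` are hypothetical, and the hard-threshold rule is only a stand-in: the abstract does not state the MDS+ shrinkage formula, so this is not the paper's optimal shrinker. The embedding dimension is taken here to be the number of eigenvalues that survive shrinkage, which is an assumption about how a shrinker could determine the dimension.

```python
import numpy as np


def hard_threshold(tau):
    """Placeholder shrinker (an assumption, not the MDS+ rule):
    keep eigenvalues above tau unchanged and zero out the rest."""
    def shrink(evals):
        return np.where(evals > tau, evals, 0.0)
    return shrink


def mds_embed(D, shrink):
    """Classical MDS with a shrinkage function applied to the eigenvalues.

    D      : (n, n) array of pairwise Euclidean distances.
    shrink : callable mapping an array of eigenvalues to shrunken values.
    Returns an (n, r) embedding, where r is the number of eigenvalues
    that remain positive after shrinkage.
    """
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    S = -0.5 * H @ (D ** 2) @ H              # MDS similarity (Gram) matrix
    evals, evecs = np.linalg.eigh(S)         # eigh returns ascending order
    order = np.argsort(evals)[::-1]          # reorder to descending
    evals, evecs = evals[order], evecs[:, order]
    shrunk = np.maximum(shrink(evals), 0.0)  # shrink, then clip negatives
    keep = shrunk > 0
    return evecs[:, keep] * np.sqrt(shrunk[keep])
```

With noise-free points and a small threshold, this reduces to classical MDS and recovers the pairwise distances up to a rigid motion. Swapping in a different `shrink` callable is the only change needed to try another shrinkage rule.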
Original language | English |
---|---|
Pages (from-to) | 333-373 |
Number of pages | 41 |
Journal | Applied and Computational Harmonic Analysis |
Volume | 51 |
State | Published - Mar 2021 |
Bibliographical note
Publisher Copyright: © 2020 The Authors
Keywords
- Dimensionality reduction
- Euclidean embedding
- MDS+
- Multidimensional scaling
- Optimal shrinkage
- Singular value thresholding