Multidimensional scaling of noisy high dimensional data

Erez Peterfreund*, Matan Gavish

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Multidimensional Scaling (MDS) is a classical technique for embedding data in low dimensions, still in widespread use today. In this paper we study MDS in a modern setting - specifically, high dimensions and ambient measurement noise. We show that as the ambient noise level increases, MDS suffers a sharp breakdown that depends on the data dimension and noise level, and derive an explicit formula for this breakdown point in the case of white noise. We then introduce MDS+, a simple variant of MDS, which applies a shrinkage nonlinearity to the eigenvalues of the MDS similarity matrix. Under a natural loss function measuring the embedding quality, we prove that MDS+ is the unique, asymptotically optimal shrinkage function. MDS+ offers improved embedding, sometimes significantly so, compared with MDS. Importantly, MDS+ calculates the optimal embedding dimension, into which the data should be embedded.
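
The pipeline described in the abstract (form the MDS similarity matrix, apply a nonlinearity to its eigenvalues, embed along the surviving eigenvectors) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The names `classical_mds_with_shrinkage` and `hard_threshold` are hypothetical, and the hard threshold is only a stand-in: the abstract does not state the optimal MDS+ shrinker, so it is not reproduced here.

```python
import numpy as np


def classical_mds_with_shrinkage(X, shrink):
    """Embed the rows of X by classical MDS after shrinking the eigenvalues.

    X      : (n, p) array, one data point per row.
    shrink : callable mapping an array of nonnegative eigenvalues to
             shrunken eigenvalues (assumption: any user-supplied rule).
    Returns an (n, r) embedding, where r is the number of eigenvalues
    that remain positive after shrinkage.
    """
    n = X.shape[0]

    # Squared Euclidean distances between all pairs of rows.
    sq_norms = np.sum(X ** 2, axis=1)
    D2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)

    # Double centering gives the MDS similarity (Gram) matrix.
    H = np.eye(n) - np.ones((n, n)) / n
    S = -0.5 * H @ D2 @ H

    # S is symmetric, so eigh applies; sort eigenvalues in descending order.
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]

    # Clip tiny negative eigenvalues caused by round-off, then shrink.
    shrunk = shrink(np.clip(evals, 0.0, None))

    # The embedding dimension is the number of eigenvalues that survive.
    keep = shrunk > 0
    return evecs[:, keep] * np.sqrt(shrunk[keep])


def hard_threshold(tau):
    """Placeholder shrinker: keep eigenvalues above tau, zero out the rest."""
    return lambda lam: np.where(lam > tau, lam, 0.0)
```

Passing `hard_threshold(tau)` as `shrink` with a fixed embedding rank in mind corresponds to ordinary truncated MDS. Any other eigenvalue rule can be swapped in through the same argument, and the number of columns returned plays the role of the embedding dimension selected by the shrinker.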

Original language: American English
Pages (from-to): 333-373
Number of pages: 41
Journal: Applied and Computational Harmonic Analysis
Volume: 51
State: Published - Mar 2021

Bibliographical note

Publisher Copyright:
© 2020 The Authors

Keywords

  • Dimensionality reduction
  • Euclidean embedding
  • MDS+
  • Multidimensional scaling
  • Optimal shrinkage
  • Singular value thresholding
