Optimal shrinkage of singular values under random data contamination

Danny Barash, Matan Gavish

Research output: Contribution to journal › Conference article › peer-review

Abstract

A low rank matrix X has been contaminated by uniformly distributed noise, missing values, outliers and corrupt entries. Reconstruction of X from the singular values and singular vectors of the contaminated matrix Y is a key problem in machine learning, computer vision and data science. In this paper, we show that common contamination models (including arbitrary combinations of uniform noise, missing values, outliers and corrupt entries) can be described efficiently using a single framework. We develop an asymptotically optimal algorithm that estimates X by manipulation of the singular values of Y, which applies to any of the contamination models considered. Finally, we find an explicit signal-to-noise cutoff, below which estimation of X from the singular value decomposition of Y must fail, in a well-defined sense.
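The general idea of estimating X by manipulating the singular values of Y can be sketched as follows. This is only an illustrative sketch: it uses plain hard thresholding with a hand-picked threshold `tau` and additive Gaussian noise, all of which are assumptions made for the example. It is not the asymptotically optimal shrinker or the contamination framework developed in the paper, and the names `shrink_singular_values` and `hard_threshold` are hypothetical.

```python
import numpy as np


def shrink_singular_values(Y, shrinker):
    """Apply a scalar shrinker to every singular value of Y and rebuild the matrix."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.array([shrinker(v) for v in s])
    # Scale the columns of U by the shrunken singular values, then recombine.
    return (U * s_shrunk) @ Vt


def hard_threshold(tau):
    """Hypothetical shrinker: keep singular values above tau, zero out the rest."""
    return lambda v: v if v > tau else 0.0


# Synthetic example (assumed setup): a rank-2 matrix plus small Gaussian noise.
rng = np.random.default_rng(0)
n, r = 50, 2
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
Y = X + 0.1 * rng.standard_normal((n, n))

# tau = 3.0 is picked by hand for this noise level; it is not a derived cutoff.
X_hat = shrink_singular_values(Y, hard_threshold(tau=3.0))

err_raw = np.linalg.norm(Y - X)
err_shrunk = np.linalg.norm(X_hat - X)
print(f"error of Y: {err_raw:.3f}, error after shrinkage: {err_shrunk:.3f}")
```

Passing the shrinker as a function keeps the SVD step separate from the choice of shrinkage rule, so a different rule (for example, one tuned to a particular contamination model) could be swapped in without touching the reconstruction code.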

Original language: American English
Pages (from-to): 6161-6171
Number of pages: 11
Journal: Advances in Neural Information Processing Systems
Volume: 2017-December
State: Published - 2017
Event: 31st Annual Conference on Neural Information Processing Systems, NIPS 2017 - Long Beach, United States
Duration: 4 Dec 2017 - 9 Dec 2017

Bibliographical note

Publisher Copyright:
© 2017 Neural information processing systems foundation. All rights reserved.
