Abstract
We propose a new method to estimate Wasserstein distances and optimal transport plans between two probability distributions from samples in high dimension. Unlike plug-in rules that simply replace the true distributions by their empirical counterparts, our method promotes couplings with low transport rank, a new structural assumption that is similar to the nonnegative rank of a matrix. Regularizing based on this assumption leads to drastic improvements on high-dimensional data for various tasks, including domain adaptation in single-cell RNA sequencing data. These findings are supported by a theoretical analysis that indicates that the transport rank is key in overcoming the curse of dimensionality inherent to data-driven optimal transport.
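To make the comparison with nonnegative matrix rank concrete, the sketch below builds a discrete coupling as a mixture of K product measures, which is one way to read "transport rank at most K" in the finite setting. It is an illustration of the structural assumption only, not the estimator proposed in the paper; the function name `factored_coupling` and the variables `lam`, `A`, `B` are hypothetical, and the random inputs are placeholders rather than data from the paper.

```python
import numpy as np

def factored_coupling(lam, A, B):
    """Return P = sum_k lam[k] * outer(A[k], B[k]).

    lam: (K,) nonnegative mixture weights summing to 1 (assumed).
    A:   (K, n) array, each row a probability vector on the source support.
    B:   (K, m) array, each row a probability vector on the target support.
    """
    return np.einsum("k,ki,kj->ij", lam, A, B)

rng = np.random.default_rng(0)
K, n, m = 3, 50, 60

# Placeholder inputs: random weights and random component distributions.
lam = rng.dirichlet(np.ones(K))
A = rng.dirichlet(np.ones(n), size=K)
B = rng.dirichlet(np.ones(m), size=K)

P = factored_coupling(lam, A, B)

# The marginals of P are the corresponding mixtures of the components.
mu = lam @ A  # source marginal, shape (n,)
nu = lam @ B  # target marginal, shape (m,)
assert np.allclose(P.sum(axis=1), mu)
assert np.allclose(P.sum(axis=0), nu)

# P is a sum of K nonnegative rank-one matrices, so its rank is at most K.
print("matrix rank of P:", np.linalg.matrix_rank(P))
```

A generic coupling between n and m points is an n-by-m matrix with no such factorization, whereas the factored form above is described by roughly K(n + m) numbers, which gives some intuition for why the assumption can act as a regularizer.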
| Original language | English |
|---|---|
| Pages (from-to) | 2454-2465 |
| Number of pages | 12 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 89 |
| State | Published - 2019 |
| Externally published | Yes |
| Event | 22nd International Conference on Artificial Intelligence and Statistics, AISTATS 2019 - Naha, Japan |
| Duration | 16 Apr 2019 → 18 Apr 2019 |
Bibliographical note
Publisher Copyright: © 2019 by the author(s).