The surprising effectiveness of linear unsupervised image-to-image translation

Eitan Richardson, Yair Weiss

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

4 Scopus citations

Abstract

Unsupervised image-to-image translation is an inherently ill-posed problem. Recent methods based on deep encoder-decoder architectures have shown impressive results, but we show that they only succeed due to a strong locality bias, and they fail to learn very simple nonlocal transformations (e.g. mapping upside down faces to upright faces). When the locality bias is removed, the methods are too powerful and may fail to learn simple local transformations. In this paper we introduce linear encoder-decoder architectures for unsupervised image to image translation. We show that learning is much easier and faster with these architectures and yet the results are surprisingly effective. In particular, we show a number of local problems for which the results of the linear methods are comparable to those of state-of-the-art architectures but with a fraction of the training time, and a number of nonlocal problems for which the state-of-the-art fails while linear methods succeed.

Original languageEnglish
Title of host publicationProceedings of ICPR 2020 - 25th International Conference on Pattern Recognition
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages7855-7861
Number of pages7
ISBN (Electronic)9781728188089
DOIs
StatePublished - 2020
Event25th International Conference on Pattern Recognition, ICPR 2020 - Virtual, Milan, Italy
Duration: 10 Jan 202115 Jan 2021

Publication series

NameProceedings - International Conference on Pattern Recognition
ISSN (Print)1051-4651

Conference

Conference25th International Conference on Pattern Recognition, ICPR 2020
Country/TerritoryItaly
CityVirtual, Milan
Period10/01/2115/01/21

Bibliographical note

Publisher Copyright:
© 2020 IEEE

Fingerprint

Dive into the research topics of 'The surprising effectiveness of linear unsupervised image-to-image translation'. Together they form a unique fingerprint.

Cite this