Domain adaptation-can quantity compensate for quality?

Shai Ben-David, Shai Shalev-Shwartz, Ruth Urner

Research output: Contribution to conference, Paper, peer-review

13 Scopus citations

Abstract

The Domain Adaptation problem in machine learning occurs when the distribution generating the test data differs from the one that generates the training data. A common approach to this issue is to train a standard learner for the learning task with the available training sample (generated by a distribution that is different from the test distribution). In this work we address this approach, investigating whether there exist successful learning methods for which learning of a target task can be achieved by substituting the standard target-distribution generated sample by a (possibly larger) sample generated by a different distribution without worsening the error guarantee on the learned classifier. We give a positive answer, showing that this is possible when using a Nearest Neighbor algorithm. We show this under the assumptions of covariate shift as well as a bound on the ratio of the probability weights between the source (training) and target (test) distribution. We further show that these assumptions are not always sufficient to allow such a replacement of the training sample: for proper learning, where the output classifier has to come from a predefined class, we prove that any learner needs access to data generated from the target distribution.
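The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's construction or proof. It is a toy one-dimensional example under assumptions chosen here for illustration: the source distribution is uniform on [0, 1], the target distribution is uniform on [0, 0.5] (so the target density is at most twice the source density), and both domains share one hypothetical labeling rule, which is what covariate shift means. All function names and the threshold 0.3 are made up for this example.

```python
import random
from bisect import bisect_left


def label(x):
    # Hypothetical labeling rule shared by source and target (covariate shift):
    # the label depends on x in the same way in both domains.
    return 1 if x > 0.3 else 0


def sample_source(n, rng):
    # Assumed source (training) distribution: uniform on [0, 1].
    return [rng.random() for _ in range(n)]


def sample_target(n, rng):
    # Assumed target (test) distribution: uniform on [0, 0.5].
    # Its density is at most 2x the source density, so the ratio is bounded.
    return [rng.random() * 0.5 for _ in range(n)]


def fit_1nn(xs):
    # A 1-Nearest-Neighbor "model" is just the sorted, labeled training sample.
    pts = sorted(xs)
    return pts, [label(x) for x in pts]


def predict_1nn(model, x):
    # Return the label of the training point closest to x.
    # Assumes the training sample is non-empty.
    pts, ys = model
    i = bisect_left(pts, x)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(pts)]
    best = min(candidates, key=lambda j: abs(pts[j] - x))
    return ys[best]


def target_error(n_source, n_test=2000, seed=0):
    # Train on source data only, then measure the error rate on target data.
    rng = random.Random(seed)
    model = fit_1nn(sample_source(n_source, rng))
    test = sample_target(n_test, rng)
    wrong = sum(predict_1nn(model, x) != label(x) for x in test)
    return wrong / n_test


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, target_error(n))
```

In this toy setup the classifier never sees a target-generated point, yet its error on target data should shrink as the source sample grows, which is the informal sense in which quantity stands in for quality. The negative result for proper learning mentioned in the abstract is not illustrated here.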

Original language: English
State: Published - 2012
Event: International Symposium on Artificial Intelligence and Mathematics, ISAIM 2012 - Fort Lauderdale, FL, United States
Duration: 9 Jan 2012 to 11 Jan 2012

Conference

Conference: International Symposium on Artificial Intelligence and Mathematics, ISAIM 2012
Country/Territory: United States
City: Fort Lauderdale, FL
Period: 9/01/12 to 11/01/12
