Abstract
The domain adaptation problem in machine learning arises when the distribution generating the test data differs from the one generating the training data. A common approach is to train a standard learner for the task on the available training sample (generated by a distribution that differs from the test distribution). In this work we investigate whether there exist successful learning methods for which a target task can be learned by substituting a (possibly larger) sample generated by a different distribution for the standard target-distribution-generated sample, without worsening the error guarantee on the learned classifier. We give a positive answer, showing that this is possible when using a Nearest Neighbor algorithm. We show this under the assumptions of covariate shift as well as a bound on the ratio of the probability weights between the source (training) and target (test) distributions. We further show that these assumptions are not always sufficient to allow such a replacement of the training sample: for proper learning, where the output classifier has to come from a predefined class, we prove that any learner needs access to data generated from the target distribution.
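The setting described above can be illustrated with a minimal simulation: under covariate shift, the labeling function is shared by source and target while only the input marginals differ, and the target/source density ratio stays bounded. The sketch below is an assumption-laden toy example (the specific distributions, sample sizes, and threshold labeling function are illustrative choices, not from the paper): it trains a 1-Nearest-Neighbor classifier on source-generated data and evaluates it on target-generated data.

```python
import random

random.seed(0)

def label(x):
    # Deterministic labeling shared by source and target (covariate shift):
    # only the input distributions differ, not the labeling rule.
    return 1 if x > 0.5 else 0

def draw_source():
    # Source marginal: uniform on [0, 1].
    return random.random()

def draw_target():
    # Target marginal: mixture skewed toward [0.5, 1]. Its density is 0.5 on
    # [0, 0.5) and 1.5 on [0.5, 1], so the target/source density ratio is
    # bounded by 1.5 -- the bounded-weight-ratio assumption.
    if random.random() < 0.5:
        return random.random()
    return 0.5 + 0.5 * random.random()

def nn_predict(train, x):
    # 1-Nearest-Neighbor: predict the label of the closest training point.
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]

# Train on a source-generated sample, test on a target-generated sample.
train = [(x, label(x)) for x in (draw_source() for _ in range(500))]
test = [(x, label(x)) for x in (draw_target() for _ in range(200))]

err = sum(nn_predict(train, x) != y for x, y in test) / len(test)
print(f"target-domain error of source-trained 1-NN: {err:.3f}")
```

With a dense-enough source sample covering the target's support, the nearest neighbor of any target point is very close, so the target-domain error stays small even though no target-generated training data was used.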
Original language | English |
---|---|
State | Published - 2012 |
Event | International Symposium on Artificial Intelligence and Mathematics, ISAIM 2012 - Fort Lauderdale, FL, United States. Duration: 9 Jan 2012 → 11 Jan 2012 |
Conference

Conference | International Symposium on Artificial Intelligence and Mathematics, ISAIM 2012 |
---|---|
Country/Territory | United States |
City | Fort Lauderdale, FL |
Period | 9/01/12 → 11/01/12 |