Abstract
Real-world datasets, especially high-dimensional ones, commonly contain missing entries. Since most machine learning, data analysis, and statistical methods cannot handle missing values gracefully, these must be filled in before such methods are applied. It is no surprise, therefore, that there has been long-standing interest in methods for imputing missing values. One recent, popular, and effective approach, the IRMI stepwise regression imputation method, models each feature as a linear combination of all other features: a linear regression model is computed for each real-valued feature on the basis of all other features in the dataset, and its predictions are used as imputation values. However, this iterative formulation lacks a convergence guarantee. Here we propose a closely related method, stated as a single optimization problem, together with a block coordinate-descent solution that is guaranteed to converge to a local minimum. Experimental results on both synthetic and benchmark datasets are comparable to those of the IRMI method whenever it converges. However, IRMI often diverges in the experiments described here, whereas our method performs markedly better than the other methods considered.
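The per-feature regression loop described in the abstract can be illustrated with a minimal sketch. This is not the IRMI implementation, nor the optimization method proposed in the paper. It is a generic iterative regression imputer written under several assumptions: the function name `iterative_regression_impute`, the mean initialisation, the fixed number of sweeps, and the small `ridge` term are all hypothetical choices made for this example. Like the iterative scheme discussed above, it carries no convergence guarantee.

```python
import numpy as np

def iterative_regression_impute(X, n_sweeps=20, ridge=1e-6):
    """Fill NaNs by repeatedly regressing each column on all the others.

    Illustrative only: fixed sweep count, no convergence check.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()

    # Initialise every missing entry with the mean of its column's observed values.
    col_means = np.nanmean(X, axis=0)
    filled[missing] = np.take(col_means, np.nonzero(missing)[1])

    n, d = X.shape
    for _ in range(n_sweeps):
        for j in range(d):
            rows = missing[:, j]
            if not rows.any():
                continue  # nothing to impute in this column
            # Design matrix: all other (currently filled) columns plus an intercept.
            others = np.delete(filled, j, axis=1)
            A = np.column_stack([others, np.ones(n)])
            A_obs, y_obs = A[~rows], filled[~rows, j]
            # Least squares fit on rows where column j is observed
            # (the tiny ridge term is an assumption, added for numerical stability).
            w = np.linalg.solve(A_obs.T @ A_obs + ridge * np.eye(d), A_obs.T @ y_obs)
            # Replace the missing entries of column j with the model's predictions.
            filled[rows, j] = A[rows] @ w
    return filled
```

Calling `iterative_regression_impute(X)` on an array with `np.nan` marking the missing entries returns a copy with those entries replaced and the observed entries left unchanged.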
Original language | English |
---|---|
Title of host publication | Pattern Recognition Applications and Methods - 6th International Conference, ICPRAM 2017, Revised Selected Papers |
Editors | Ana Fred, Maria De Marsico, Gabriella Sanniti di Baja |
Publisher | Springer Verlag |
Pages | 62-79 |
Number of pages | 18 |
ISBN (Print) | 9783319936468 |
State | Published - 2018 |
Event | 6th International Conference on Pattern Recognition Applications and Methods, ICPRAM 2017 - Porto, Portugal. Duration: 24 Feb 2017 → 26 Feb 2017 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 10857 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 6th International Conference on Pattern Recognition Applications and Methods, ICPRAM 2017 |
---|---|
Country/Territory | Portugal |
City | Porto |
Period | 24/02/17 → 26/02/17 |
Bibliographical note
Publisher Copyright: © Springer International Publishing AG, part of Springer Nature 2018.