Optimal linear imputation with a convergence guarantee

Yehezkel S. Resheff*, Daphna Weinshall

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citations

Abstract

Real-world datasets, especially high-dimensional ones, commonly contain missing entries. Since most machine learning, data analysis, and statistical methods cannot handle missing values gracefully, these values must be filled in before the methods are applied. It is therefore no surprise that there has been long-standing interest in methods for imputing missing values. One recent, popular, and effective approach, the IRMI stepwise regression imputation method, models each feature as a linear combination of all other features. A linear regression model is computed for each real-valued feature on the basis of all other features in the dataset, and the resulting predictions are used as imputation values. However, the proposed iterative formulation lacks a convergence guarantee. Here we propose a closely related method, stated as a single optimization problem, together with a block coordinate-descent solution that is guaranteed to converge to a local minimum. Experimental results on both synthetic and benchmark datasets are comparable to those of the IRMI method whenever it converges. In the set of experiments described here, however, IRMI often diverges, while the performance of our method is shown to be markedly superior to that of the other methods.
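
The stepwise regression idea summarized in the abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not IRMI itself and not the block coordinate-descent method proposed in the paper: it assumes ordinary least squares, mean initialization, and a fixed number of sweeps, and the names `iterative_linear_impute` and `n_sweeps` are hypothetical. Like the iterative scheme the abstract describes, this sketch carries no convergence guarantee.

```python
import numpy as np


def iterative_linear_impute(X, n_sweeps=10):
    """Fill NaNs by repeatedly regressing each column on all other columns.

    Illustrative sketch only (hypothetical helper, not the paper's algorithm).
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()

    # Assumption: initialize missing entries with the observed column means.
    col_means = np.nanmean(X, axis=0)
    filled[missing] = np.take(col_means, np.nonzero(missing)[1])

    for _ in range(n_sweeps):
        for j in range(X.shape[1]):
            rows = missing[:, j]
            if not rows.any():
                continue  # nothing to impute in this column
            # Design matrix: all other (currently filled) columns plus an intercept.
            others = np.delete(filled, j, axis=1)
            A = np.column_stack([others, np.ones(len(others))])
            # Fit on rows where column j is observed, predict where it is missing.
            coef, *_ = np.linalg.lstsq(A[~rows], filled[~rows, j], rcond=None)
            filled[rows, j] = A[rows] @ coef
    return filled
```

Observed entries are never modified; only the positions that were NaN in the input are overwritten on each sweep.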

Original language: English
Title of host publication: Pattern Recognition Applications and Methods - 6th International Conference, ICPRAM 2017, Revised Selected Papers
Editors: Ana Fred, Maria De Marsico, Gabriella Sanniti di Baja
Publisher: Springer Verlag
Pages: 62-79
Number of pages: 18
ISBN (Print): 9783319936468
State: Published - 2018
Event: 6th International Conference on Pattern Recognition Applications and Methods, ICPRAM 2017 - Porto, Portugal
Duration: 24 Feb 2017 to 26 Feb 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10857 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 6th International Conference on Pattern Recognition Applications and Methods, ICPRAM 2017
Country/Territory: Portugal
City: Porto
Period: 24/02/17 to 26/02/17

Bibliographical note

Publisher Copyright:
© Springer International Publishing AG, part of Springer Nature 2018.
