Accounting for Non-ignorable Sampling and Non-response in Statistical Matching

Daniela Marella*, Danny Pfeffermann

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Data for statistical analysis are often available from different samples, with each sample containing measurements on only some of the variables of interest. Statistical matching attempts to generate a fused database containing matched measurements on all the target variables. In this article, we consider the use of statistical matching when the samples are drawn by informative sampling designs and are subject to not-missing-at-random (NMAR) non-response. The problem with ignoring the sampling process and non-response is that the distribution of the data observed for the responding units can be very different from the distribution holding for the population data, which may distort the inference process and result in a matched database that misrepresents the joint distribution in the population. Our proposed methodology employs the empirical likelihood approach and is shown to perform well in a simulation experiment and when applied to real sample data.
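The distortion described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' empirical likelihood method; it only shows, under assumed settings, how an informative design shifts the sample distribution away from the population distribution. The population model (normal with mean 50 and standard deviation 10), the inclusion-probability rule and the function name `simulate` are all hypothetical choices made for illustration, and non-response is not modelled. The correction uses a standard inverse-probability weighted (Hájek) mean, which assumes the inclusion probabilities are known.

```python
import random


def simulate(n_pop=20000, seed=0):
    """Compare naive and weighted sample means under informative sampling.

    All distributional choices here are illustrative assumptions.
    """
    rng = random.Random(seed)

    # Hypothetical finite population of a single target variable y.
    population = [rng.gauss(50.0, 10.0) for _ in range(n_pop)]

    # Informative Poisson sampling: the inclusion probability depends on y
    # itself, so units with large y are over-represented in the sample.
    sample = []
    for y in population:
        pi = min(1.0, max(0.01, 0.1 * (y / 50.0) ** 2))
        if rng.random() < pi:
            sample.append((y, pi))

    pop_mean = sum(population) / n_pop

    # Naive estimator: ignores the sampling design.
    naive_mean = sum(y for y, _ in sample) / len(sample)

    # Hajek estimator: weights each sampled unit by 1 / pi.
    weighted_mean = (
        sum(y / pi for y, pi in sample) / sum(1.0 / pi for _, pi in sample)
    )
    return pop_mean, naive_mean, weighted_mean


if __name__ == "__main__":
    pop_mean, naive_mean, weighted_mean = simulate()
    print(f"population mean:      {pop_mean:.2f}")
    print(f"naive sample mean:    {naive_mean:.2f}")
    print(f"weighted sample mean: {weighted_mean:.2f}")
```

In this setup the naive mean overshoots the population mean because selection favours large values, while the weighted mean stays close to it. In the setting the abstract addresses, response probabilities are unknown and depend on the unobserved values, so simple weighting of this kind is not available.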

Original language: English
Pages (from-to): 269-293
Number of pages: 25
Journal: International Statistical Review
Volume: 91
Issue number: 2
State: Published - Aug 2023

Bibliographical note

Publisher Copyright:
© 2022 The Authors. International Statistical Review published by John Wiley & Sons Ltd on behalf of International Statistical Institute.

Keywords

  • empirical likelihood
  • fusion
  • IPF algorithm
  • matching uncertainty
  • NMAR non-response
  • sample and respondents distributions
