Testing Independence Under Biased Sampling

Yaniv Tenzer*, Micha Mandel, Or Zuk

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Testing for dependence between pairs of random variables is a fundamental problem in statistics. In some applications, data are subject to selection bias that can create spurious dependence. An important example is truncation models, in which observed pairs are restricted to a specific subset of the X-Y plane. Standard tests for independence are not suitable in such cases, and alternative tests that take the selection bias into account are required. Here, we generalize the notion of quasi-independence with respect to the sampling mechanism, and study the problem of detecting any deviations from it. We develop two test statistics motivated by the classic Hoeffding’s statistic, and use two approaches to compute their distribution under the null: (i) a bootstrap-based approach, and (ii) a permutation test with nonuniform probability of permutations. We also handle an important application to the case of censoring with truncation, by estimating the biased sampling mechanism from the data. We prove the validity of the tests, and show, using simulations, that they improve power compared to competing methods for important special cases. The tests are applied to four datasets, two that are subject to truncation, with and without censoring, and two to bias mechanisms related to length bias.
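To make approach (ii) concrete, the following is a minimal sketch of a permutation test in which permutations are not equally likely, assuming a known weight function w(x, y) and a target in which a permutation π has probability proportional to the product of w(x_i, y_π(i)). It is an illustration under stated assumptions, not the authors' implementation: the names `truncation_weight`, `abs_corr`, and `weighted_permutation_pvalue` are hypothetical, the absolute Pearson correlation is a stand-in for the Hoeffding-type statistics of the article, and the Metropolis swap sampler uses an untuned number of swaps between retained permutations.

```python
import numpy as np


def truncation_weight(x, y):
    """Hypothetical weight w(x, y) for truncation: 1 if x <= y, else 0."""
    return float(x <= y)


def abs_corr(x, y):
    """Stand-in statistic (absolute Pearson correlation), not the article's."""
    return abs(np.corrcoef(x, y)[0, 1])


def weighted_permutation_pvalue(x, y, w, stat, n_perm=200, sweep=None, seed=0):
    """Permutation p-value where P(perm) is proportional to prod_i w(x[i], y[perm[i]]).

    Permutations are drawn with a Metropolis chain that proposes swapping
    two entries of the current permutation.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    sweep = sweep or 5 * n          # assumed number of swaps between draws
    perm = np.arange(n)             # identity is valid: observed pairs have w > 0
    observed = stat(x, y)
    exceed = 0
    for _ in range(n_perm):
        for _ in range(sweep):
            i, j = rng.integers(n, size=2)
            current = w(x[i], y[perm[i]]) * w(x[j], y[perm[j]])
            proposed = w(x[i], y[perm[j]]) * w(x[j], y[perm[i]])
            # Accept with probability min(1, proposed / current).
            if rng.random() * current < proposed:
                perm[i], perm[j] = perm[j], perm[i]
        if stat(x, y[perm]) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)


# Simulated example: X and Y independent, but only pairs with X <= Y are observed.
rng = np.random.default_rng(1)
x_all, y_all = rng.uniform(0, 1, 200), rng.uniform(0, 1, 200)
keep = x_all <= y_all
x, y = x_all[keep], y_all[keep]
p = weighted_permutation_pvalue(x, y, truncation_weight, abs_corr)
print(f"n = {len(x)}, p-value = {p:.3f}")
```

With the truncation weight, every retained permutation keeps all pairs inside the observable region x <= y, so the reference distribution reflects the selection mechanism rather than treating all permutations as equally likely.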

Original language: American English
Pages (from-to): 2194-2206
Number of pages: 13
Journal: Journal of the American Statistical Association
Volume: 117
Issue number: 540
State: Published - 2022

Bibliographical note

Publisher Copyright:
© 2021 American Statistical Association.

Keywords

  • Markov chain Monte Carlo
  • Permutation test
  • Quasi-independence
  • Truncation
  • Weighted distribution
