TY - JOUR
T1 - Toward a curse of dimensionality appropriate (CODA) asymptotic theory for semi-parametric models
AU - Robins, James M.
AU - Ritov, Ya'acov
PY - 1997/1/15
Y1 - 1997/1/15
N2 - We argue, that due to the curse of dimensionality, there are major difficulties with any pure or smoothed likelihood-based method of inference in designed studies with randomly missing data when missingness depends on a high-dimensional vector of variables. We study in detail a semi-parametric superpopulation version of continuously stratified random sampling. We show that all estimators of the population mean that are uniformly consistent or that achieve an algebraic rate of convergence, no matter how slow, require the use of the selection (randomization) probabilities. We argue that, in contrast to likelihood methods which ignore these probabilities, inverse selection probability weighted estimators continue to perform well achieving uniform n( 1/4 )-rates of convergence. We propose a curse of dimensionality appropriate (CODA) asymptotic theory for inference in non- and semi-parametric models in an attempt to formalize our arguments. We discuss whether our results constitute a fatal blow to the likelihood principle and study the attitude toward these that a committed subjective Bayesian would adopt. Finally, we apply our CODA theory to analyse the effect of the 'curse of dimensionality' in several interesting semi-parametric models, including a model for a two-armed randomized trial with randomization probabilities depending on a vector of continuous pretreatment covariates X. We provide substantive settings under which a subjective Bayesian would ignore the randomization probabilities in analysing the trial data. We then show that any statistician who ignores the randomization probabilities is unable to construct nominal 95 per cent confidence intervals for the true treatment effect that have both: (i) an expected length which goes to zero with increasing sample size; and (ii) a guaranteed expected actual coverage rate of at least 95 per cent over the ensemble of trials analysed by the statistician during his or her lifetime. However, we derive a new interval estimator, depending on the Randomization probabilities, that satisfies (i) and (ii).
AB - We argue, that due to the curse of dimensionality, there are major difficulties with any pure or smoothed likelihood-based method of inference in designed studies with randomly missing data when missingness depends on a high-dimensional vector of variables. We study in detail a semi-parametric superpopulation version of continuously stratified random sampling. We show that all estimators of the population mean that are uniformly consistent or that achieve an algebraic rate of convergence, no matter how slow, require the use of the selection (randomization) probabilities. We argue that, in contrast to likelihood methods which ignore these probabilities, inverse selection probability weighted estimators continue to perform well achieving uniform n( 1/4 )-rates of convergence. We propose a curse of dimensionality appropriate (CODA) asymptotic theory for inference in non- and semi-parametric models in an attempt to formalize our arguments. We discuss whether our results constitute a fatal blow to the likelihood principle and study the attitude toward these that a committed subjective Bayesian would adopt. Finally, we apply our CODA theory to analyse the effect of the 'curse of dimensionality' in several interesting semi-parametric models, including a model for a two-armed randomized trial with randomization probabilities depending on a vector of continuous pretreatment covariates X. We provide substantive settings under which a subjective Bayesian would ignore the randomization probabilities in analysing the trial data. We then show that any statistician who ignores the randomization probabilities is unable to construct nominal 95 per cent confidence intervals for the true treatment effect that have both: (i) an expected length which goes to zero with increasing sample size; and (ii) a guaranteed expected actual coverage rate of at least 95 per cent over the ensemble of trials analysed by the statistician during his or her lifetime. However, we derive a new interval estimator, depending on the Randomization probabilities, that satisfies (i) and (ii).
UR - http://www.scopus.com/inward/record.url?scp=0031016404&partnerID=8YFLogxK
U2 - 10.1002/(sici)1097-0258(19970215)16:3<285::aid-sim535>3.0.co;2-%23
DO - 10.1002/(sici)1097-0258(19970215)16:3<285::aid-sim535>3.0.co;2-%23
M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???
C2 - 9004398
AN - SCOPUS:0031016404
SN - 0277-6715
VL - 16
SP - 285
EP - 319
JO - Statistics in Medicine
JF - Statistics in Medicine
IS - 1-3
ER -