A closer examination of three small-sample approximations to the multiple-imputation degrees of freedom

Incomplete data is a common complication in applied research. In this study, we use simulation to compare two approaches to the multiple imputation of a continuous predictor: multiple imputation through chained equations and multivariate normal imputation. This study extends earlier work by being the first to 1) compare the small-sample approximations to the multiple-imputation degrees of freedom proposed by Barnard and Rubin (1999, Biometrika 86: 948–955); Lipsitz, Parzen, and Zhao (2002, Journal of Statistical Computation and Simulation 72: 309–318); and Reiter (2007, Biometrika 94: 502–508) and 2) ask whether the sampling distribution of the t statistics is in fact a Student’s t distribution with the specified degrees of freedom. In addition to varying the imputation method, we varied the number of imputations (m = 5, 10, 20, or 100) used to obtain the combined estimates and standard errors, which were averaged over 500,000 replications, for a linear model that regressed the log price of a home on its age (years) and size (square feet) in a sample of 25 observations. Six age values were randomly set to missing in each replication. As assessed by the absolute percentage bias and the relative percentage bias, the two approaches performed similarly. The absolute bias of the regression coefficients for age and size was roughly −0.1% across the levels of m for both approaches; the absolute bias for the constant was 0.6% for the chained-equations approach and 1.0% for the multivariate normal model. The absolute biases of the standard errors for age, size, and the constant were 0.2%, 0.3%, and 1.2%, respectively. In general, the relative percentage bias was slightly smaller for the chained-equations approach. Graphical and numerical inspection of the empirical sampling distributions for the three t statistics suggested that the area from the shoulder to the tail was reasonably well approximated by a t distribution and that the small-sample approximations to the multiple-imputation degrees of freedom proposed by Barnard and Rubin and by Reiter performed satisfactorily.
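
For readers unfamiliar with the degrees-of-freedom adjustments named in the abstract, the following is a minimal Python sketch of the Barnard and Rubin (1999) small-sample approximation as it is commonly stated. It is not code from the article. The function names and the numeric inputs are hypothetical, and the complete-data degrees of freedom of 22 is an assumption (25 observations minus 3 estimated coefficients). The approximations of Lipsitz, Parzen, and Zhao and of Reiter are not shown.

```python
def combine_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors.

    Returns the pooled estimate, the within-imputation variance,
    the between-imputation variance, and the total variance.
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                      # pooled point estimate
    u_bar = sum(variances) / m                      # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    t = u_bar + (1 + 1 / m) * b                     # total variance
    return q_bar, u_bar, b, t


def barnard_rubin_df(m, u_bar, b, df_complete):
    """Barnard-Rubin small-sample degrees of freedom (commonly stated form)."""
    t = u_bar + (1 + 1 / m) * b
    lam = (1 + 1 / m) * b / t                       # share of variance due to missingness
    # Observed-data degrees of freedom, shrunk from the complete-data value.
    df_obs = (df_complete + 1) / (df_complete + 3) * df_complete * (1 - lam)
    if lam == 0:
        return df_obs                               # no between-imputation variance
    df_old = (m - 1) / lam ** 2                     # large-sample (Rubin 1987) value
    return 1 / (1 / df_old + 1 / df_obs)


# Illustrative inputs only: m = 5, within variance 1.0, between variance 0.5,
# complete-data df assumed to be 25 - 3 = 22.
df_example = barnard_rubin_df(m=5, u_bar=1.0, b=0.5, df_complete=22)
```

In this form the adjusted value never exceeds the complete-data degrees of freedom, which is the property that matters in a sample as small as the 25 observations studied here.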

Publication Type:
Journal Article
DOI and Other Identifiers:
st0235 (Other)
Published in:
Stata Journal, Volume 11, Number 3
Keywords:
missing data; multiple imputation; small-sample degrees of freedom

 Record created 2017-04-01, last modified 2017-04-28
