Multiple Imputation: Research for the Third Decade
DMS-9705158
Donald B. Rubin and John Barnard
Harvard University

The third decade of multiple imputation begins with ever-growing applications, extending even into areas not originally proposed (e.g., environmental studies, chemistry), an increasing availability of software, and a growing body of statistical research. The expanded, almost routine, use of multiple imputation means that many technical issues, considered relatively minor in its early development, now need attention. This research addresses several of the most important of these issues, in particular: (1) creating multiple imputations under more general and flexible models; (2) creating "nested" multiple imputations in data sets with large and highly variable fractions of missing information; (3) using cross-match techniques to analyze data sets having only a few multiple imputations but large fractions of missing information, especially for obtaining valid p-values; (4) analyzing data sets with nested multiple imputations; and (5) conducting exploratory and diagnostic analyses on multiply-imputed data sets.

Missing values are prevalent in many data sets and can greatly hinder inference. Multiple imputation has proven to be a useful mode of inference in the presence of missing data. The basic aim of multiple imputation is to allow users of incomplete data sets, who typically have little information about the missing-data mechanism, to reach valid statistical inferences using only (1) standard complete-data analysis techniques and (2) simple rules for combining the output of the complete-data analyses. Multiple imputation has been successfully used in many contexts, including the social and economic sciences, the history of science, and biomedical applications.
This effort extends the applicability and ease of use of multiple imputation as it enters a third decade in which it should become an important tool in everyday statistical practice.
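
The abstract does not spell out the "simple rules for combining the output of the complete-data analyses." As an illustration only, the sketch below assumes they are the standard combining rules for a scalar estimand: average the m point estimates, average the m within-imputation variances, and add the between-imputation variance inflated by (1 + 1/m). The function name and the input numbers are hypothetical and are not taken from the award.

```python
import math
from statistics import mean, variance


def pool_scalar_estimates(estimates, variances):
    """Pool m complete-data results for one scalar quantity.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, total variance, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)      # pooled point estimate
    u_bar = mean(variances)      # average within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b

    if b == 0:
        # Identical estimates: no between-imputation variability.
        df = math.inf
    else:
        # Relative increase in variance due to missing data.
        r = (1 + 1 / m) * b / u_bar
        df = (m - 1) * (1 + 1 / r) ** 2
    return q_bar, total, df


# Hypothetical results from m = 3 imputed data sets.
estimate, total_variance, df = pool_scalar_estimates(
    [1.0, 2.0, 3.0], [0.5, 0.5, 0.5]
)
print(estimate, math.sqrt(total_variance), df)
```

Interval estimates and p-values would then typically be based on a t reference distribution with the returned degrees of freedom. With only a few imputations and a large fraction of missing information, the degrees of freedom become small, which is the setting that items (2) through (4) of the proposed research target.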

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 9705158
Program Officer: Joseph M. Rosenblatt
Project Start:
Project End:
Budget Start: 1997-08-01
Budget End: 2000-07-31
Support Year:
Fiscal Year: 1997
Total Cost: $240,492
Indirect Cost:
Name: Harvard University
Department:
Type:
DUNS #:
City: Cambridge
State: MA
Country: United States
Zip Code: 02138