Multiple Imputation: Research for the Third Decade
DMS-9705158
Donald B. Rubin and John Barnard
Harvard University

The third decade of multiple imputation begins with ever-growing applications, including areas for which it was not originally proposed (e.g., environmental studies, chemistry), increasing availability of software, and a growing body of statistical research. The expanded, almost routine, use of multiple imputation means that many technical issues, considered relatively minor in its early development, now need attention. This research addresses several of the most important issues, in particular: (1) creating multiple imputations under more general and flexible models; (2) creating "nested" multiple imputations in data sets with large and highly variable fractions of missing information; (3) using cross-match techniques to analyze data sets having only a few multiple imputations but large fractions of missing information, especially for obtaining valid p-values; (4) analyzing data sets with nested multiple imputations; and (5) conducting exploratory and diagnostic analyses on multiply-imputed data sets.

Missing values are prevalent in many data sets and can be a great hindrance to inference. Multiple imputation has proven to be a useful mode of inference in the presence of missing data. The basic aim of multiple imputation is to allow users of incomplete data sets, who typically have little information about the missing-data mechanism, to reach valid statistical inferences using only (1) standard complete-data analysis techniques and (2) simple rules for combining the output of the complete-data analyses. Multiple imputation has been used successfully in many contexts, including the social and economic sciences, the history of science, and biomedical applications.
This effort extends the applicability and ease of use of multiple imputation as it enters its third decade, in which it should become an important tool in everyday statistical practice.
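
The abstract does not spell out the "simple rules for combining the output of the complete-data analyses". As a minimal sketch, assuming they are the standard combining rules for a scalar estimand (often called Rubin's rules), the pooling step for m completed data sets might look like the following. The function name `pool_estimates` and its arguments are hypothetical and chosen only for illustration; the nested and cross-match extensions listed above are not covered.

```python
import math
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Pool m complete-data results for one scalar quantity.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, total variance, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)   # pooled point estimate
    u_bar = mean(variances)   # average within-imputation variance
    b = variance(estimates)   # between-imputation variance (divisor m - 1)

    # Total variance: within part plus inflated between part.
    total = u_bar + (1 + 1 / m) * b

    # Reference t degrees of freedom; infinite if the estimates all agree.
    if b == 0:
        df = math.inf
    else:
        df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2

    return q_bar, total, df
```

An interval estimate would then typically be formed as the pooled estimate plus or minus a t quantile, with `df` degrees of freedom, times the square root of the total variance. When only a few imputations are available and the fraction of missing information is large, `df` becomes small, which is the setting that items (2) through (4) above are aimed at.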