Incomplete data are frequently encountered in surveys because of nonresponse, inaccurate measurement, and two-phase sampling, among other causes. Any form of incomplete data can damage the representativeness of a sample, and a naive analysis of incomplete data can lead to biased estimation. Imputation is the process of assigning values to the missing items with the objective of reducing bias and improving the efficiency of the resulting estimators. This project will develop fractional imputation methods as a tool for handling incomplete data for general-purpose estimation. These methods will serve as important building blocks for a complete statistical package for the analysis of incomplete data that ultimately can be applied to problems in a variety of disciplines, including the social, behavioral, and economic sciences. In particular, the project will develop fractional imputation to address several important problems with incomplete data, including (1) likelihood-based inference, (2) robust estimation using fractional hot deck imputation, (3) synthetic imputation for survey integration, and (4) statistical matching.
Because fractional imputation is a relatively new approach for handling incomplete data, there is a critical need for theoretical and methodological development. The advantages of the fractional imputation approach lie in its computational simplicity, wide applicability, and statistical validity. By using fractional weights, fractional imputation avoids the burden of iterative computation, such as Markov chain Monte Carlo, when evaluating the conditional expectations associated with missing data. The proposed approach can be used to estimate parameters consistently and efficiently. It can also be applied to nonstandard situations such as measurement error models, regression analysis combining two different surveys, and causal inference from observational studies. The impact of the proposed research is therefore substantial because the proposed approach can be used as a general methodology for incomplete data. Because of this computational simplicity and statistical validity, the results of the proposed research should have wide applicability. The research also should have a major impact in providing complete data sets for analysis and new data products that combine information from different surveys.
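To make the role of fractional weights concrete, the following is a minimal sketch of a simplified fractional hot deck imputation within imputation cells. It is an illustration under stated assumptions, not the project's method: the function names (fractional_hot_deck, weighted_mean), the record layout (cell, weight, y), and the choice to use every observed value in a cell as a donor with a fraction proportional to the donor's weight are all hypothetical choices made for this example.

```python
from collections import defaultdict


def fractional_hot_deck(records):
    """Simplified fractional hot deck imputation (illustrative only).

    records: list of (cell, weight, y) tuples, with y = None when missing.
    Returns a list of (cell, weight, y) rows in which each missing item is
    replaced by one row per donor in the same cell. The original weight is
    split across donors in proportion to the donor weights (an assumption
    of this sketch), so the total weight is preserved.
    """
    # Collect observed values (donors) by imputation cell.
    donors = defaultdict(list)
    for cell, w, y in records:
        if y is not None:
            donors[cell].append((w, y))

    rows = []
    for cell, w, y in records:
        if y is not None:
            rows.append((cell, w, y))  # respondents keep their full weight
            continue
        pool = donors.get(cell, [])
        if not pool:
            raise ValueError(f"no donors available in cell {cell!r}")
        total = sum(dw for dw, _ in pool)
        for dw, dy in pool:
            # Fractional weight dw / total; the fractions sum to one.
            rows.append((cell, w * dw / total, dy))
    return rows


def weighted_mean(rows):
    """Weighted mean of y over the fractionally imputed rows."""
    num = sum(w * y for _, w, y in rows)
    den = sum(w for _, w, _ in rows)
    return num / den


# Hypothetical toy data: two cells, one missing item in each.
records = [
    ("a", 1.0, 2.0),
    ("a", 1.0, 4.0),
    ("a", 2.0, None),
    ("b", 1.0, 10.0),
    ("b", 1.0, None),
]
rows = fractional_hot_deck(records)
estimate = weighted_mean(rows)
```

The imputed data set is built in a single pass, with no iterative sampling, and any weighted complete-data estimator can then be applied to the resulting rows. Model-based variants of fractional imputation derive the fractional weights from a fitted model rather than from donor weights; that step is not shown here.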