This research develops imputation techniques and variance estimation methods for survey data with nonrespondents. The investigator focuses on (1) validating and comparing, both theoretically and empirically, the existing imputation techniques, and developing better procedures where necessary; and (2) developing correct variance estimators for a given imputation method that produces correct survey estimates. Special attention will be paid to random hot deck imputation using models, nearest neighbor imputation, cold deck imputation, multivariate imputation, longitudinal imputation, imputation for quantiles, and imputation for non-ignorable response. Many variance estimation techniques (such as linearization/Taylor expansion, the jackknife, balanced half samples or balanced repeated replication, random groups, and the bootstrap) will be studied. Particular issues to be addressed in variance estimation include non-negligible sampling fractions; approximations in applying replication methods (such as grouping and collapsing); complex and composite imputation methods (in the sense that a number of different imputation methods are used and/or imputed data are used to impute nonrespondents for other variables); variance estimation for nearest neighbor imputation; variance estimation for sample quantiles; and problems with imputed values that cannot be identified from the data set.
Most surveys have nonrespondents. Item nonresponse occurs when sampled units cooperate in the survey but fail to answer some of the questions. The procedures commonly used to compensate for item nonresponse are imputation techniques, which insert values in place of the missing responses. It is common practice to treat the imputed values as if they had been observed, and to compute survey estimates and assess their variability using standard formulas designed for the case of no nonresponse. This practice, however, can lead to biased statistical analysis. For example, standard formulas may seriously underestimate the true variability, because they do not account for the changes in variability due to nonresponse and/or imputation. This research develops correct and simple-to-implement statistical procedures for analyzing survey data with nonrespondents and imputation, and will solve some real statistical problems in survey agencies such as the Census Bureau, the Bureau of Labor Statistics, and Westat.
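The underestimation described above can be illustrated with a small simulation. The following is a minimal sketch and not the investigator's method. It assumes a simple random sample from a normal population, item nonresponse that occurs completely at random, and a basic random hot deck in which each missing value is replaced by a randomly drawn respondent value. The function names, sample size, response rate, and seed are hypothetical choices made for illustration only.

```python
import random
import statistics


def hot_deck_impute(values, responded, rng):
    """Replace each nonrespondent's value with a randomly drawn respondent value."""
    donors = [v for v, r in zip(values, responded) if r]
    return [v if r else rng.choice(donors) for v, r in zip(values, responded)]


def naive_variance_of_mean(completed):
    """Standard formula s^2 / n, treating imputed values as if they were observed."""
    return statistics.variance(completed) / len(completed)


def simulate(n=100, response_rate=0.6, reps=2000, seed=12345):
    """Compare the Monte Carlo variance of the imputed-data mean with the
    average of the naive variance estimates (all settings are illustrative)."""
    rng = random.Random(seed)
    means, naive = [], []
    for _ in range(reps):
        values = [rng.gauss(50.0, 10.0) for _ in range(n)]
        # Assumed response mechanism: each unit responds independently.
        responded = [rng.random() < response_rate for _ in range(n)]
        if sum(responded) < 2:
            continue  # skip degenerate samples with too few donors
        completed = hot_deck_impute(values, responded, rng)
        means.append(statistics.fmean(completed))
        naive.append(naive_variance_of_mean(completed))
    return statistics.variance(means), statistics.fmean(naive)


true_var, naive_var = simulate()
print(f"Monte Carlo variance of the imputed mean: {true_var:.3f}")
print(f"Average naive variance estimate:          {naive_var:.3f}")
```

Under these assumptions the naive estimate should come out well below the Monte Carlo variance, since the standard formula behaves as if all n values were real observations while only the respondents carry information, and the random donor draws add further noise.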