The purpose of the proposed research is to develop methodologies for the analysis of data with highly differential probabilities of inclusion by developing model-based methods that combine the robustness of the design-based approach with the optimality obtainable through a model. In the unequal-probability-of-selection sample designs often found in national, population-based, complex-sample-design health surveys, correlations between the probability of selection and the sampled data can induce bias. Weights equal to the inverse of the probability of selection are often used to counteract this bias. Highly disproportional sample designs have large weights, which can introduce unnecessary variability in statistics such as the estimate of the population mean. Weight trimming reduces large weights to a fixed cutpoint value and adjusts weights below this value to maintain the untrimmed weight sum. This reduces variability at the cost of introducing some bias. Standard approaches are not "data-driven": they do not use the data to make the appropriate bias-variance tradeoff, or else do so in a highly inefficient fashion. We propose to develop model-based methods for "weight trimming" to supplement standard, ad hoc design-based methods in disproportional probability-of-inclusion designs where the variance introduced by the sample weights outweighs the bias correction they provide. We will also consider the use of these models to estimate population parameters in linear and generalized linear regression models. We will develop these models in the context of stratified and poststratified known-probability sample designs, and extend their use to more general multistage cluster sample designs. We plan to develop Bayesian models, as we believe that they offer both theoretical and practical advantages over traditional frequentist analyses and that their use has been neglected in the analysis of survey data.
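The standard trimming step described above can be illustrated with a minimal sketch. It assumes the excess weight is redistributed proportionally among the weights below the cutpoint; the abstract does not specify the adjustment rule, so that choice, the function name `trim_weights`, and the example values are illustrative assumptions rather than the proposed method.

```python
def trim_weights(weights, cutpoint):
    """Cap weights at `cutpoint` and rescale the uncapped weights so that
    the total equals the untrimmed weight sum.

    Assumption: the excess is spread proportionally over the weights
    below the cutpoint. Other redistribution rules are possible.
    """
    total = float(sum(weights))
    if cutpoint * len(weights) < total:
        raise ValueError("cutpoint is too small to preserve the weight sum")

    w = [float(x) for x in weights]
    # Rescaling can push a weight above the cutpoint, so repeat until no
    # weight exceeds it. Each pass caps at least one more weight, so at
    # most len(w) + 1 passes are needed.
    for _ in range(len(w) + 1):
        capped = [x >= cutpoint for x in w]
        free_sum = sum(x for x, c in zip(w, capped) if not c)
        if free_sum == 0:
            break  # every weight already sits at the cutpoint
        # Weight mass that the uncapped units must carry after trimming.
        target_free = total - cutpoint * sum(capped)
        factor = target_free / free_sum
        w = [cutpoint if c else x * factor for x, c in zip(w, capped)]
        if all(x <= cutpoint for x in w):
            break
    return w


# Hypothetical weights: one unit has a very large weight.
trimmed = trim_weights([1, 1, 2, 10], cutpoint=5)
print(trimmed, sum(trimmed))
```

In this sketch the cutpoint is supplied by the analyst rather than chosen from the data, which is the limitation of standard approaches that the proposed model-based methods are meant to address.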
We will consider three major applications: determining predictors and mechanisms of injury to children in passenger vehicle crashes, using the population-based surveillance dataset of the Partners for Child Passenger Safety; exploring the development of cardiovascular risk factors in children, using the National Health and Nutrition Examination Survey; and determining the prevalence of smoking and cancer screening behavior among adults, using calibration estimators developed to combine data from the Behavioral Risk Factor Surveillance System and the National Health Interview Survey.
Elliott, Michael R (2009) Model Averaging Methods for Weight Trimming in Generalized Linear Regression Models. J Off Stat 25:1-20
Elliott, Michael R (2008) Model Averaging Methods for Weight Trimming. J Off Stat 24:517-540