Health research relies on probability-based survey data for a wide range of purposes, such as evaluating policy impact, quantifying the prevalence of diseases and health conditions in the general population, identifying risk factors, assessing health disparities, and measuring changes over time. The need for accurate data on the U.S. population has never been greater. However, survey participation has been rapidly declining, and methodological studies have shown substantial bias in some survey estimates because of nonresponse. Results from multiple nonresponse bias studies indicate that adjusting for nonresponse using demographic characteristics, as is common practice, may not be sufficient. This study proposes the use of government administrative data that are related to health measures to improve weighting adjustments. In an era of "big data" that has seen data from multiple sources combined to expand the types of analyses, there is a missed opportunity to use administrative data for improving survey estimates rather than only augmenting survey data with analytic variables. Administrative data can be variable-poor, such as enrollment in a government health plan without any other data, but complete for the entire population; survey data are generally variable-rich, with a diverse set of demographic, factual, and behavioral measures, but can suffer from nonresponse and other survey errors. This study aims to leverage the population estimates from administrative data to improve inference from the survey data. There are three main impediments to using administrative data to correct for nonresponse. A key challenge results from measurement differences between the responses to the survey questions and the administrative data elements. A second hindrance arises from problems with data linkage across sources. A third obstacle is posed by the usual difference in target populations between surveys and administrative databases.
The main objectives of this study are to implement a set of methods that overcome these challenges and to evaluate the approach's effectiveness in reducing nonresponse bias, along with its effects on variance estimates and mean squared error. If successful, the proposed approach could lead to better utilization of available resources to improve health survey data. This study offers a test based on one survey and three administrative databases (from two sources), but if deemed effective, the approach could be applied to many other surveys using a variety of administrative data sources. The integrity of official statistics relies on accurate estimates, which in turn rely on the methods used to produce them. Declining survey participation is a serious threat to results based on surveys, and this study aims to contribute to the development of better methods to correct for nonresponse.
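
To make the idea of an administrative-data weighting adjustment concrete, the sketch below shows one simple form it could take: post-stratifying respondent weights so that the weighted share of respondents enrolled in a government health plan matches the share known from administrative records. This is an illustration only, not the study's method. The function names, the five-respondent data set, and the 40% enrollment share are all hypothetical, and the sketch assumes the survey and administrative enrollment measures agree, that records link without error, and that both sources cover the same target population, which are the three impediments named above. The mse helper uses the standard decomposition of mean squared error into squared bias plus variance.

```python
def poststratify(weights, enrolled, admin_share):
    """Rescale base weights so the weighted enrolled share equals admin_share.

    Hypothetical helper: two cells only (enrolled / not enrolled).
    """
    total = sum(weights)
    enrolled_total = sum(w for w, e in zip(weights, enrolled) if e)
    if not 0 < enrolled_total < total:
        raise ValueError("need respondents in both cells")
    # One adjustment factor per cell; the overall weight total is preserved.
    f_in = admin_share * total / enrolled_total
    f_out = (1 - admin_share) * total / (total - enrolled_total)
    return [w * (f_in if e else f_out) for w, e in zip(weights, enrolled)]


def weighted_mean(values, weights):
    """Weighted mean of a survey measure."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def mse(bias, variance):
    """Mean squared error as squared bias plus variance."""
    return bias ** 2 + variance


# Made-up respondent data (assumption, for illustration only).
weights = [1.0, 1.0, 1.0, 1.0, 1.0]            # base weights
enrolled = [True, False, False, False, False]  # enrolled in the health plan
poor_health = [1, 0, 0, 1, 0]                  # survey health measure
admin_share = 0.4                              # assumed administrative share

adjusted = poststratify(weights, enrolled, admin_share)
unadjusted_estimate = weighted_mean(poor_health, weights)
adjusted_estimate = weighted_mean(poor_health, adjusted)
```

In this toy example enrollees are underrepresented among respondents (20% versus an assumed 40% in the population), so their weights are scaled up and the estimate of the health measure shifts accordingly. Whether such an adjustment lowers mean squared error depends on how strongly enrollment relates to both response propensity and the health measure, since larger weight variation can increase variance.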

Public Health Relevance

Much of health research relies on probability-based survey data, yet participation in surveys has been rapidly declining. Evidence suggests that the resulting nonresponse bias can be substantial and that commonly used methods and auxiliary information to correct for nonresponse may be insufficient. This study aims to develop and evaluate the use of available administrative data from government agencies to improve adjustment for nonresponse.

National Institutes of Health (NIH)
National Institute on Aging (NIA)
Exploratory/Developmental Grants (R21)
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Karraker, Amelia Wilkes
Research Triangle Institute
Res Triangle
United States