Survey data collection represents a significant expenditure for the federal government. Ten federal statistical agencies alone spent more than $1.3 billion on surveys in 2004. Many of these government-funded survey research projects are essential for advancing scientific knowledge about the health of the United States population; examples include the National Health Interview Survey (NHIS), the National Survey of Family Growth (NSFG), the National Survey on Drug Use and Health (NSDUH), and the Health and Retirement Study (HRS). Hundreds of smaller government-funded research projects in the health sciences also rely on survey data collection. Unfortunately, government resources available for continuing and improving these surveys are becoming scarcer, making it essential that researchers in the many disciplines that collect survey data, and especially in the health sciences, use these federal resources efficiently. Growing nonresponse and increasing reliance on less expensive modes of data collection have made surveys harder to plan, and the considerable uncertainty associated with survey data collection makes it difficult to minimize survey errors and survey costs efficiently. In response to these problems, survey methodologists have developed a conceptual framework for survey data collection known as responsive survey design (RSD), a principled, scientific method for decreasing survey errors and survey costs. Many survey projects, including the surveys mentioned above, are now saving money and making better use of available resources by applying RSD concepts. Unfortunately, the analytic techniques used to implement RSDs to date have been extremely simplistic, and no coherent analytic framework for RSD has been proposed in the RSD literature. The Bayesian analysis framework is a natural fit for RSDs, given its ability to update prior beliefs about uncertain parameters and outcomes using current information.
Surprisingly, this framework has not yet been applied in a rigorous fashion to the RSD practices that are often used in health survey research. The Bayesian approach provides a framework for taking advantage of prior information to develop the accurate predictions needed to make RSD function as efficiently as possible. Given that the application of RSD to existing federal data collections has increased cost efficiency by up to 25%, even greater gains in efficiency may be possible by using Bayesian methods to improve the accuracy of predicted data collection outcomes. Combining simulation work and analyses of real health survey data, the proposed research project aims to 1) elicit reasonable prior distributions for the parameters in response propensity models that form the core of many RSD decisions, 2) experimentally evaluate the effectiveness of Bayesian methods in informing adaptive interventions in an RSD framework, and 3) use Bayesian methods to optimize two-phase sampling decisions in a way that minimizes costs and maximizes the quality of survey estimates.
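To make the idea of updating prior beliefs with current information concrete, the following is a minimal sketch of a conjugate Beta-Binomial update for a single response propensity. It is an illustration only, not the project's method: the function names, the Beta(6, 14) prior, and the counts of contact attempts are all hypothetical, and the response propensity models described above would typically involve covariates rather than a single rate.

```python
def update_beta(prior_a, prior_b, responded, attempted):
    """Return the Beta posterior parameters after observing survey outcomes.

    With a Beta(prior_a, prior_b) prior on the response propensity and
    `responded` completed interviews out of `attempted` cases, the posterior
    is Beta(prior_a + responded, prior_b + attempted - responded).
    """
    if not 0 <= responded <= attempted:
        raise ValueError("responded must be between 0 and attempted")
    return prior_a + responded, prior_b + (attempted - responded)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior, e.g. from an earlier study iteration or expert opinion:
# Beta(6, 14) has mean 0.30.
prior = (6.0, 14.0)

# Hypothetical current-wave data: 18 responses from 100 attempted cases.
posterior = update_beta(*prior, responded=18, attempted=100)

print("prior mean propensity:    ", beta_mean(*prior))
print("posterior mean propensity:", beta_mean(*posterior))
```

In this sketch the posterior mean moves from the prior value toward the observed response rate as cases accumulate, which is the kind of mid-collection prediction update an RSD decision rule could draw on.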

Public Health Relevance

The U.S. federal government dedicates substantial resources to the annual collection of health survey data that is ultimately used by policy-makers and researchers in the health sciences. Unfortunately, health survey data collection is prone to uncertainty, and survey methodologists have developed new tools that enable survey researchers to 1) track indicators of survey errors and costs during data collection, and 2) introduce mid-stream design changes in response to changes in these indicators. Despite their recent gains in popularity, these tools lack a coherent analytic framework that "learns" from the data as it is collected; thus, the proposed research aims to develop approaches that take advantage of prior information available from previous study iterations and/or expert opinion to improve the prediction of data collection outcomes.

National Institutes of Health (NIH)
National Institute on Aging (NIA)
Research Project (R01)
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Bhattacharyya, Partha
University of Michigan Ann Arbor
Biostatistics & Other Math Sci
Organized Research Units
Ann Arbor
United States