Panel surveys are a powerful tool for measuring individuals, households, and economic units, but almost all suffer from panel attrition. Panel attrition, whereby those participating in the first wave of a panel drop out in later waves, reduces the effective sample size and can bias survey estimates if the tendency to drop out is systematically related to the substantive outcomes of interest. Using only the collected data, analysts cannot determine how much attrition degrades their analyses without making untestable assumptions about the attrition process; external sources of information are needed. Refreshment samples -- new, randomly sampled respondents given the questionnaire at the same time as a second or subsequent wave of the panel -- can provide this information. This project develops a variety of novel statistical methodologies that use the information in refreshment samples to correct biases due to panel attrition. The underlying idea is to use the original and refreshment data to estimate statistical models for imputing the missing values, yielding completed datasets that effectively correct for attrition bias. The methods will be applied to two high-profile panel studies with refreshment samples: the 2006-2008 General Social Survey and the 2007-2008 AP/Yahoo News Election Panel.

This research will improve statistical analyses of panel studies with refreshment samples, enabling more accurate conclusions from panel datasets. More specifically, the project will give government agencies that sponsor large panel studies with refreshment samples better options for creating public-use datasets that account for attrition. The research also will inform the design of future panel studies by demonstrating the virtues of refreshment samples when coupled with appropriate statistical methods and software.

Project Report

Longitudinal or panel surveys, in which the same individuals are interviewed repeatedly at different points in time, are an important source of data across a wide range of academic disciplines, government agencies, and business sectors. Unfortunately, most panel surveys suffer from attrition, i.e., people drop out at later waves of the survey. Using only the cases that completed all waves can result in inaccurate estimates; for example, in a multi-wave opinion poll, only individuals with strong opinions might remain in all waves. In this project, we developed methods that analysts can use to correct inferences for the effects of attrition. Our methods leverage the information in refreshment samples, which are random samples of new respondents given the questionnaire at the same time as a second or subsequent wave of the panel. With appropriate statistical models, these samples offer information that allows analysts to adjust for the effects of attrition under mild assumptions about the reasons for attrition.

The intellectual merit of our project centered on the following research activities. We developed methods for panels with more than two waves of data collection and more than one refreshment sample; previous methodology focused on surveys with only two panel waves and one refreshment sample. We developed two methods for leveraging the information in refreshment samples when the data include a large number of categorical variables; previous methodology focused on applications with small numbers of variables. We also developed methodology for assessing the sensitivity of estimates to nonresponse in the initial wave of the panel and in the refreshment sample. Such nonresponse has been largely disregarded in the literature on refreshment samples, although, as we show, it can invalidate the methods used to correct for attrition.
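The identification idea behind these methods can be illustrated with a simplified sketch: a two-wave panel with binary outcomes and a logistic additive-nonignorable selection model, a common working model in the refreshment-sample literature. The parameter values, the moment-matching solver, and all variable names below are our illustrative assumptions, not the project's actual models; the point is only that the refreshment sample's wave-2 marginal supplies the extra moment needed to identify attrition that depends on the unobserved wave-2 outcome.

```python
# Illustrative sketch (not the project's models): a refreshment sample
# identifies nonignorable attrition in a two-wave panel with binary
# outcomes Y1, Y2 and response indicator W, where dropout follows
#   P(W = 1 | Y1, Y2) = sigmoid(a + b*Y1 + c*Y2).
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
sig = lambda x: 1.0 / (1.0 + np.exp(-x))

# --- Simulate a panel whose attrition depends on the wave-2 outcome ---
n = 200_000
p1, q1, q0 = 0.6, 0.8, 0.3          # P(Y1=1), P(Y2=1|Y1=1), P(Y2=1|Y1=0)
a, b, c = -0.3, 0.5, 1.0            # true selection coefficients (c != 0)
y1 = rng.random(n) < p1
y2 = rng.random(n) < np.where(y1, q1, q0)
w = rng.random(n) < sig(a + b * y1 + c * y2)   # W=1: stayed for wave 2

# --- Observed moments ---
# Panel: Y1 for everyone; Y2 only for completers (W = 1).
m_y1   = y1.mean()
m_w1   = w[y1].mean()                # P(W=1 | Y1=1)
m_w0   = w[~y1].mean()               # P(W=1 | Y1=0)
m_y2w1 = y2[y1 & w].mean()           # P(Y2=1 | Y1=1, W=1), biased for q1
m_y2w0 = y2[~y1 & w].mean()          # P(Y2=1 | Y1=0, W=1), biased for q0
# Refreshment sample: an independent draw revealing the wave-2 marginal.
y1r = rng.random(n) < p1
y2r = rng.random(n) < np.where(y1r, q1, q0)
m_y2 = y2r.mean()                    # P(Y2=1), the extra identifying moment

# --- Solve the six moment equations for the six unknowns ---
def resid(t):
    p1_, q1_, q0_, a_, b_, c_ = t
    w11, w10 = sig(a_ + b_ + c_), sig(a_ + b_)   # P(W=1 | Y1=1, Y2=1/0)
    w01, w00 = sig(a_ + c_), sig(a_)             # P(W=1 | Y1=0, Y2=1/0)
    pw1 = q1_ * w11 + (1 - q1_) * w10
    pw0 = q0_ * w01 + (1 - q0_) * w00
    return [p1_ - m_y1,
            pw1 - m_w1,
            pw0 - m_w0,
            q1_ * w11 / pw1 - m_y2w1,
            q0_ * w01 / pw0 - m_y2w0,
            p1_ * q1_ + (1 - p1_) * q0_ - m_y2]

start = [m_y1, m_y2w1, m_y2w0, 0.0, 0.0, 0.0]    # complete-case starting values
fit = least_squares(resid, start,
                    bounds=([0.01] * 3 + [-5.0] * 3, [0.99] * 3 + [5.0] * 3))
p1_hat, q1_hat, q0_hat = fit.x[:3]
print(f"P(Y2=1|Y1=1): complete-case {m_y2w1:.3f}, corrected {q1_hat:.3f}, truth {q1}")
print(f"P(Y2=1|Y1=0): complete-case {m_y2w0:.3f}, corrected {q0_hat:.3f}, truth {q0}")
```

Without the refreshment moment `m_y2`, the system has more unknowns than equations and the selection coefficient on the unobserved `Y2` cannot be estimated; with it, the model is exactly identified, and the recovered joint distribution could then drive imputation of the missing wave-2 values.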
We analyzed several multi-wave political opinion polls, assessing and correcting for the potential effects of attrition. For example, applications of the methods found that panel attrition had biased estimates of Obama favorability in the 2008 election, biased estimates of political interest late in the campaign, and even changed the conclusion about the relationship between gender and campaign interest. These results highlight that panel attrition is not just a technical issue of interest only to methodologists; it has direct implications for the substantive knowledge claims that can be made from panel surveys. We also adapted ideas from the refreshment sample methodology to incorporate external information, such as approximately known counts for sub-populations, into latent class models. These ideas can be used to correct for biases arising from certain types of nonresponse.

Our project also had the following broader impacts. The methods developed in this project inform the design and analysis of longitudinal surveys, benefiting scholars who focus on survey methods and practitioners who conduct surveys in academic, commercial, and non-profit sectors. We trained two PhD students in statistical science: one used research from the project to complete her dissertation and has accepted a faculty position at a top research university; the other has used research from this project for her PhD. We also trained a PhD student in political science, who has used research from this project for her PhD, and we mentored a postdoctoral associate in political science who, as part of his duties, advised researchers at Duke on the design and implementation of panel studies.

Agency
National Science Foundation (NSF)
Institute
Division of Social and Economic Sciences (SES)
Type
Standard Grant (Standard)
Application #
1061241
Program Officer
Cheryl L. Eavey
Project Start
Project End
Budget Start
2011-06-01
Budget End
2014-05-31
Support Year
Fiscal Year
2010
Total Cost
$160,000
Indirect Cost
Name
Duke University
Department
Type
DUNS #
City
Durham
State
NC
Country
United States
Zip Code
27705