The recent surge in demand for microdata from researchers and policy makers, especially when data collection is paid for with public funds, has heightened concerns about confidentiality protection. Confidentiality of responses is a serious commitment that data-collecting agencies make to study participants, and the same commitment is expected of any agency that disseminates the data to researchers and policy makers. Alongside the growing demand for microdata, a number of commercial databases containing identifying information, such as names and addresses, together with demographic information have become accessible. These databases raise the concern that an intruder could link the anonymous survey data released for public use by data collection agencies with the commercial databases and thereby identify one or more survey respondents. This research proposal has three primary objectives: (1) to assess the risk of disclosure using data from four test-bed national probability surveys covering a wide variety of topics, under two broad classes of intruder models (Type I, in which the intruder knows that an individual personally known to them is in the survey, and Type II, in which an intruder with access to an external database of names and addresses seeks to identify survey respondents and hence gain access to confidential information); (2) to develop and evaluate new methods to avoid disclosure; and (3) to develop strategies for replacing variables in public-use data sets that are deemed to increase the risk of disclosure with summary variables that allow users to adjust or control for these variables without knowing their actual values.

National Institutes of Health (NIH)
Eunice Kennedy Shriver National Institute of Child Health & Human Development (NICHD)
Research Program Projects (P01)
Study Section
Pediatrics Subcommittee (CHHD)
University of Michigan Ann Arbor
Ann Arbor
United States
Couper, Mick P; Singer, Eleanor (2013) Informed Consent for Web Paradata Use. Surv Res Methods 7:57-67
Carroll, Tamar W; Gutmann, Myron P (2011) The limits of autonomy: the Belmont Report and the history of childhood. J Hist Med Allied Sci 66:82-115
Couper, Mick P; Singer, Eleanor; Conrad, Frederick G et al. (2010) Experimental Studies of Disclosure Risk, Disclosure Harm, Topic Sensitivity, and Survey Participation. J Off Stat 26:287-300
An, Di; Little, Roderick J A; McNally, James W (2010) A multiple imputation approach to disclosure limitation for high-age individuals in longitudinal studies. Stat Med 29:1769-78
Singer, Eleanor; Couper, Mick P (2010) Communicating disclosure risk in informed consent statements. J Empir Res Hum Res Ethics 5:1-8
Lazer, David; Pentland, Alex; Adamic, Lada et al. (2009) Social science. Computational social science. Science 323:721-3
Couper, Mick P; Singer, Eleanor (2009) The role of numeracy in informed consent for surveys. J Empir Res Hum Res Ethics 4:17-26
Singer, Eleanor; Couper, Mick P (2008) Do incentives exert undue influence on survey participation? Experimental evidence. J Empir Res Hum Res Ethics 3:49-56
Gutmann, Myron; Witkowski, Kristine; Colyer, Corey et al. (2008) Providing Spatial Data for Secondary Analysis: Issues and Current Practices relating to Confidentiality. Popul Res Policy Rev 27:639-665
VanWey, Leah K; Rindfuss, Ronald R; Gutmann, Myron P et al. (2005) Confidentiality and spatially explicit data: concerns and challenges. Proc Natl Acad Sci U S A 102:15337-42