Concerns about privacy and personal identity impede the use of data about genomic variation, phenotypes, demographics, and exposures from large numbers of people to uncover the contributions of these factors to health and disease, knowledge that can improve clinical care. People worry that these data, and genomic data in particular, cannot be secured. Many fear that data about them will be used in ways they oppose (e.g., to deny them and those they love access to jobs and insurance) because existing legal rules about such uses are not comprehensive. Others worry that genomic data will undermine their self-understanding, for example, by identifying misattributed parentage or challenging beliefs about ancestral history. To evaluate and address concerns about privacy and identity, more research is needed to identify the risks, both real and perceived, and the benefits of data gathering in order to inform the public and policy more effectively. The driving hypothesis of the Genetic Privacy and Identity in Community Settings (GetPreCiSe) Center is that the debate about genetic privacy and identity to date has been based (a) on an incomplete understanding of the influences on the actors involved in genomics research and translation and (b) on possible, rather than probable, risks. Too much of the research has focused on what individuals say, effectively minimizing the role of community and social influences in shaping attitudes toward privacy. Evidence already suggests that the likelihood that a person will be re-identified or harmed in some manner is quite low. Thus, current policies can both over- and under-protect people and data, depending on the context.
This project will convene a diverse group of scholars with broadly interdisciplinary perspectives, together with community advisors, to develop a more comprehensive understanding of these worries and the factors that influence them and to model actual risks to privacy and identity, all of which will be used to inform policy. GetPreCiSe is guided by four interacting specific aims:
1. To enhance our understanding of the impact of threats to privacy and identity in genomic data. Concerns about privacy and identity involve (a) the Individual, who has been the focus of most policy debate, as well as (b) Families, (c) Communities, and (d) various Social institutions, each facing its own distinctive calculus of risk and benefit. All of these are subject to an array of influences.
2. To measure the efficacy of efforts to protect privacy and identity by (a) Communities, (b) IRBs, (c) Institutions that collect, hold, and share genomic and phenomic data, and (d) Law, using quantitative, analytic, and legal analyses.
3. To develop models to quantify the probability of genomic data re-identification and harm that take into account the set of influencing factors and efforts to protect data, as well as the costs that attacker(s) would incur to mount the attack and the potential benefits and penalties that the attacker(s) might receive with a successful attack.
4. To address concerns by developing interventions that provide certainty and enhance institutional trust, as well as policy solutions that could deter intrusions on privacy and misuse of data.
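
The third aim weighs an attacker's costs, benefits, and penalties against the probability of re-identification. A toy expected-payoff calculation can illustrate that idea. This is only a minimal sketch under stated assumptions, not the Center's model: the function names, the linear form of the payoff, the premise that an attacker acts only when the expected payoff is positive, and all numbers are hypothetical.

```python
def attacker_expected_payoff(p_reid, benefit, penalty, cost):
    """Expected net payoff of one attack attempt (hypothetical toy model).

    p_reid  -- probability that the attack re-identifies the target (0..1)
    benefit -- attacker's gain from a successful attack
    penalty -- attacker's loss (e.g., a sanction) tied to a successful attack
    cost    -- up-front cost of mounting the attack, paid in every case
    """
    if not 0.0 <= p_reid <= 1.0:
        raise ValueError("p_reid must be a probability between 0 and 1")
    return p_reid * (benefit - penalty) - cost


def effective_risk(p_reid, benefit, penalty, cost):
    """Re-identification risk if only a payoff-seeking attacker would act.

    Assumption: an attack is mounted only when its expected payoff is
    positive; otherwise the effective risk is taken to be zero.
    """
    payoff = attacker_expected_payoff(p_reid, benefit, penalty, cost)
    return p_reid if payoff > 0 else 0.0


if __name__ == "__main__":
    # Illustrative numbers only; none come from the project.
    print(attacker_expected_payoff(0.5, 100.0, 20.0, 10.0))
    print(effective_risk(0.5, 100.0, 20.0, 10.0))
    print(effective_risk(0.5, 100.0, 20.0, 50.0))
```

In this sketch, raising the attack cost or the penalty can push the expected payoff below zero, which is one way to express the difference between a possible risk and a probable one.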

Public Health Relevance

Genomics research and the incorporation of its fruits into clinical care are impeded by concerns about privacy and identification. These worries, which have multiple sources, are poorly understood, and legal and social efforts to address them are inadequate. The Genetic Privacy and Identity in Community Settings (GetPreCiSe) Center will use broadly interdisciplinary approaches to develop a more complete understanding of these concerns, which, when used to inform modeling, will allow development of evidence-based methods to allay concerns, thereby facilitating the use of genomics to improve health.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project with Complex Structure (RM1)
Study Section: Special Emphasis Panel (ZHG1-HGR-P (J1))
Program Officer: McEwen, Jean
Vanderbilt University Medical Center
Independent Hospitals
United States
Hazel, J W; Clayton, E W; Malin, B A et al. (2018) Is it time for a universal genetic forensic database? Science 362:898-900
Xia, Weiyi; Wan, Zhiyu; Yin, Zhijun et al. (2018) It's all in the timing: calibrating temporal penalties for biomedical data sharing. J Am Med Inform Assoc 25:25-31
Wan, Zhiyu; Vorobeychik, Yevgeniy; Kantarcioglu, Murat et al. (2017) Controlling the signal: Practical privacy protection of genomic data sharing through Beacon services. BMC Med Genomics 10:39
Wang, Shuang; Jiang, Xiaoqian; Tang, Haixu et al. (2017) A community effort to protect genomic data sharing, collaboration and outsourcing. NPJ Genom Med 2:33
Prasser, Fabian; Gaupp, James; Wan, Zhiyu et al. (2017) An Open Source Tool for Game Theoretic Health Data De-Identification. AMIA Annu Symp Proc 2017:1430-1439
Wan, Zhiyu; Vorobeychik, Yevgeniy; Xia, Weiyi et al. (2017) Expanding Access to Large-Scale Genomic Data While Promoting Privacy: A Game Theoretic Approach. Am J Hum Genet 100:316-322