Biomedical research increasingly relies on information gathered at the point of care in addition to traditional clinical trial data collection. The HIPAA Privacy Rule requires that reasonable safeguards against unwanted disclosure be taken before patient data are disseminated. What constitutes "reasonable safeguards" remains hard to quantify, however. Hence, most de-identification strategies used in practice today rely on simple suppression of identifiers such as name, address, and social security number. Several studies, by our group and others, have shown that these simple strategies are insufficient. As demonstrated by Lin et al.[42], there might be data for which disclosure is not possible without compromising privacy. Ultimately, a quantitative analysis must guide the determination of whether safeguards are reasonable. To address these issues, we propose to continue our investigation of the quantification of trade-offs between data disclosure and privacy protection, taking into account linkable attributes in the data. In this proposal, we seek to continue our research as follows:

(1) Theory. Strengthen the theoretical foundations of disclosure control by further investigating the problem of minimizing information loss while ensuring a predefined level of ambiguity with respect to patient identity, and by developing a theory for linking patient data that have been subjected to disclosure control methods.

(2) Tools. We will construct a tool that links data being considered for disclosure with data that are kept in a repository. This tool will supply information that aids in (i) evaluating disclosure control measures empirically, and (ii) enabling sensitivity analyses of vulnerability conditioned on both the parameters used by the disclosure control mechanism and assumptions regarding adversarial data possession. Information thus obtained can (a) be used to recommend transformations so that the risk of privacy breaches can be decreased to a desired level, and (b) serve as a quantitative basis for the determination of safeguard reasonableness.

(3) Evaluation of disclosure control algorithms. We will evaluate algorithms for disclosure control developed by our group and others using the tool described in (2).
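To make the notions of "level of ambiguity" and linkage in aims (1) and (2) concrete, the following is a minimal sketch, not the project's actual tool. It assumes that ambiguity can be approximated by counting how many records share the same values on a set of linkable attributes, and that an adversary links by exact match on those attributes. The function names, attribute names, and toy records are all hypothetical.

```python
from collections import Counter

def ambiguity_levels(records, linkable):
    """For each record, count how many records in the same data set
    share its values on the linkable attributes (hypothetical measure)."""
    keys = [tuple(r[a] for a in linkable) for r in records]
    counts = Counter(keys)
    return [counts[k] for k in keys]

def link_counts(released, repository, linkable):
    """For each released record, count the repository records that
    match it exactly on the linkable attributes."""
    repo_counts = Counter(tuple(r[a] for a in linkable) for r in repository)
    return [repo_counts[tuple(r[a] for a in linkable)] for r in released]

# Fictitious data: identifiers already suppressed in `released`,
# while the repository stands in for data an adversary might hold.
released = [
    {"zip": "02115", "birth_year": 1960, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02115", "birth_year": 1960, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "02116", "birth_year": 1975, "sex": "M", "diagnosis": "influenza"},
]
repository = [
    {"name": "A", "zip": "02115", "birth_year": 1960, "sex": "F"},
    {"name": "B", "zip": "02115", "birth_year": 1960, "sex": "F"},
    {"name": "C", "zip": "02116", "birth_year": 1975, "sex": "M"},
    {"name": "D", "zip": "02116", "birth_year": 1980, "sex": "M"},
]
LINKABLE = ["zip", "birth_year", "sex"]

levels = ambiguity_levels(released, LINKABLE)
links = link_counts(released, repository, LINKABLE)

# A record that matches exactly one repository entry is uniquely linkable.
at_risk = [i for i, n in enumerate(links) if n == 1]
print("ambiguity per record:", levels)
print("repository matches per record:", links)
print("uniquely linkable record indices:", at_risk)
```

In this toy setting, suppressing names alone leaves the third record uniquely linkable. A disclosure control method would then generalize or suppress linkable attributes until every record reaches the desired ambiguity level, at some cost in information loss.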

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Research Project (R01)
Project #
5R01LM007273-06
Application #
7495044
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Sim, Hua-Chuan
Project Start
2006-09-15
Project End
2010-09-14
Budget Start
2008-09-15
Budget End
2010-09-14
Support Year
6
Fiscal Year
2008
Total Cost
$205,862
Indirect Cost
Name
Brigham and Women's Hospital
Department
Type
DUNS #
030811269
City
Boston
State
MA
Country
United States
Zip Code
02115
Mehta, Sanjay R; Vinterbo, Staal A; Little, Susan J (2014) Ensuring privacy in the study of pathogen genetics. Lancet Infect Dis 14:773-777
Vinterbo, Staal A; Sarwate, Anand D; Boxwala, Aziz A (2012) Protecting count queries in study design. J Am Med Inform Assoc 19:750-7
Lasko, Thomas A; Vinterbo, Staal A (2010) Spectral Anonymization of Data. IEEE Trans Knowl Data Eng 22:437-446
Vinterbo, Staal A; Dreiseitl, Stephan; Ohno-Machado, Lucila (2006) Approximation properties of haplotype tagging. BMC Bioinformatics 7:8
Vinterbo, Staal A; Kim, Eun-Young; Ohno-Machado, Lucila (2005) Small, fuzzy and interpretable gene expression based classifiers. Bioinformatics 21:1964-70
Ohno-Machado, Lucila; Silveira, Paulo Sergio Panse; Vinterbo, Staal (2004) Protecting patient privacy by quantifiable control of disclosures in disseminated databases. Int J Med Inform 73:599-606
Weber, Griffin; Vinterbo, Staal; Ohno-Machado, Lucila (2004) Multivariate selection of genetic markers in diagnostic classification. Artif Intell Med 31:155-67