Health information technology has enabled the healthcare community to store and share large amounts of health and healthcare data electronically. While secondary use of this data has significantly enhanced the quality and efficiency of medical and healthcare research, there is growing concern about the privacy risks of such use of personal data. The goal of this research, in response to this challenge, is to develop and test a novel data-masking technology that healthcare organizations can use to prevent or limit privacy disclosure when sharing patient data for research. To protect patient privacy, the Health Insurance Portability and Accountability Act (HIPAA) has established a set of rules concerning what information cannot be released to a third party. However, studies have shown that the HIPAA rules lack the flexibility to adequately meet the diverse needs of data users; they can be under-protective in some cases and over-protective in others. Recognizing this limitation, HIPAA also provides guidelines that enable a scientific assessment of privacy disclosure risk to determine whether the data is appropriate for release. This research focuses on this aspect of HIPAA and its related topics.
The specific aims of this research are: (1) to identify weaknesses in the HIPAA rule-based privacy protection mechanism and demonstrate this problem using data available to users with different access levels; (2) to propose metrics for assessing and quantifying privacy disclosure risk and data utility; (3) to develop methods and techniques for privacy protection when sharing and disseminating data; and (4) to conduct experiments to evaluate the aforementioned risk and utility metrics and data-masking techniques. The proposing team has identified an effective technique to systematically compromise data privacy. This provides a basis for a more thorough study to achieve specific aim 1. Methods grounded in statistics and information theory will be employed to construct the metrics for specific aim 2. The data-masking approach for specific aim 3 employs an innovative divide-and-counter strategy, which first partitions data into subsets and then masks the data within each subset. The experimental design for specific aim 4 involves performance evaluations in terms of disclosure risk, data utility, and computational scalability, using three categories of data: clinical data, Medicare claims, and publicly available personal data. This research is highly relevant to the mission of NIH. By adequately protecting privacy, the proposed technology will alleviate concerns about loss of participant confidentiality and enable improved quality and efficiency for research based on secondary use of data. This will greatly help design and develop "programs for the collection, dissemination, and exchange of information in medicine and health," thereby achieving NIH's goal to "expand the knowledge base in medical and associated sciences." This research will also offer valuable insights for policy makers to assess the tradeoff between privacy protection and data sharing and analysis.
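The partition-then-mask idea described for specific aim 3 can be illustrated with a minimal sketch. The abstract does not say how the data is partitioned or how it is masked within a subset, so everything below is an assumption made for illustration: the function names `partition` and `mask`, the group-size parameter `k`, the choice to partition by sorting a single numeric attribute, and the choice to mask by replacing each value with its group mean (a simple form of microaggregation). It is not the project's actual method.

```python
from statistics import mean


def partition(values, k):
    """Hypothetical partition step: sort record indices by value and
    split them into consecutive groups of at least k records."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    # Merge a short trailing group into the previous one so that
    # every group ends up with at least k members.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


def mask(values, k):
    """Hypothetical masking step: replace each value with the mean of
    its group, so individual values are hidden within the group."""
    masked = [None] * len(values)
    for group in partition(values, k):
        group_mean = mean(values[i] for i in group)
        for i in group:
            masked[i] = group_mean
    return masked


# Made-up ages used only to exercise the sketch.
ages = [34, 36, 35, 71, 69, 70, 52]
masked_ages = mask(ages, k=3)
```

In this sketch each masked value is shared by at least `k` records, which limits how precisely any one patient's value can be recovered, while the overall total (and hence the overall mean) of the attribute is unchanged. That is one simple way to see the disclosure-risk versus data-utility tradeoff the proposal sets out to measure.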

Public Health Relevance

NEW TECHNOLOGY TO PRESERVE PATIENT PRIVACY AND DATA QUALITY IN HEALTH RESEARCH: PROJECT NARRATIVE

This research addresses privacy concerns due to secondary use of health and healthcare data. The goal of this research is to develop and test a novel data-masking technology that can be used by healthcare organizations to prevent or limit privacy disclosure when sharing patient data for research. This research is highly relevant to the mission of NIH in that it will alleviate concerns about loss of participant confidentiality and enable high quality research, which will greatly help design and develop programs for the collection, dissemination, and exchange of information in medicine and health, thereby achieving NIH's goals to expand the knowledge base in medical and associated sciences and promote the highest level of scientific integrity, public accountability, and social responsibility in the conduct of science.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 5R01LM010942-02
Application #: 8318617
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Sim, Hua-Chuan
Project Start: 2011-09-01
Project End: 2014-08-31
Budget Start: 2012-09-01
Budget End: 2013-08-31
Support Year: 2
Fiscal Year: 2012
Total Cost: $226,545
Indirect Cost: $40,674

Name: University of Massachusetts Lowell
Department: Administration
Type: Other Domestic Higher Education
DUNS #: 956072490
City: Lowell
State: MA
Country: United States
Zip Code: 01854

Gong, Qiyuan; Luo, Junzhou; Yang, Ming et al. (2017) Anonymizing 1:M microdata with high utility. Knowl Based Syst 115:15-26
Li, Xiao-Bai; Qin, Jialun (2017) Anonymizing and Sharing Medical Text Records. Inf Syst Res 28:332-352
Motiwalla, Luvai F; Li, Xiao-Bai (2016) Unveiling consumer's privacy paradox behaviour in an economic exchange. Int J Bus Inf Syst 23:307-329
Liu, Xiaoping; Li, Xiao-Bai; Motiwalla, Luvai et al. (2016) Preserving Patient Privacy When Sharing Same-Disease Data. ACM J Data Inf Qual 7:
Li, Xiao-Bai; Sarkar, Sumit (2014) Digression and Value Concatenation to Enable Privacy-Preserving Regression. MIS Q 38:679-698
Li, Xiao-Bai; Raghunathan, Srinivasan (2014) Pricing and disseminating customer data with privacy awareness. Decis Support Syst 59:63-73
Motiwalla, Luvai; Li, Xiao-Bai (2013) Developing Privacy Solutions for Sharing and Analyzing Healthcare Data. Int J Bus Inf Syst 13:
Li, Xiao-Bai; Sarkar, Sumit (2013) Class Restricted Clustering and Micro-Perturbation for Data Privacy. Manage Sci 59: