New Statistical and Computing Technologies for Breaking the Barrier to Medical Data Sharing

Wu, Samuel; Chen, Shigang

Abstract

Modern biomedical research and clinical trials, aided by digital technologies, are producing vast volumes of data, which are, however, scattered across institutes, universities, hospitals, and doctors' offices around the world. Tremendous additional value can be uncovered if medical data from different sources can be pooled and shared among researchers. Two major initiatives, by the NIH (BD2K) and the National Academies (the IOM committee for sharing clinical trial data), have been created to address the pressing issues of data availability and accessibility. However, there are major barriers to data sharing, particularly the laws and regulations for privacy protection, which have greatly slowed the free flow of medical data and routinely result in lengthy processes (e.g., IRB approval and training) before data can be accessed. Recognizing the problem, efforts are under way to improve data availability through enhanced information technologies, uniform data representations, better medical research practices, etc. But even the optimistic target set by the recent report from the IOM committee is 18 months before data will be shared with external users, subject to various concerns about data sensitivity, protection of financial and research interests, and due process for risk evaluation. Complementary to the existing efforts, this project takes a different path: it develops new statistical and computing tools that aim to break the privacy barrier to immediate data access and free data movement without breaking data privacy itself. Such tools, even with narrower scope and applicability, can be highly valuable for disseminating certain information in a timely manner (in secure and restricted forms) before the slower approval process for a more comprehensive data release is completed.
The proposed research will combine expertise from biostatistics, computer science, cyber-security, and medical practice to develop a secure data sharing framework that enables large-scale dissemination of medical data from different sources while providing provably strong privacy protection. To achieve this goal, we will investigate a new set of data masking technologies with three desirable properties. (1) Data Security: the masked medical data can be published and shared freely without the danger of leaking any pre-masking raw data. (2) Data Utility: an array of practically important statistical properties is preserved by the masked data, such that statistical inference on parameters of interest produces exactly the same results from the masked data as from the original data under general linear models, chi-squared tests, logistic regression, contingency tables, and other statistical methods frequently used in medical research. (3) Data Ubiquity: the new framework provides convenient channels not only for established data sources such as hospitals and medical institutes to publish data, but also for individual investigators and patients to participate in data collection and sharing through crowdsourcing.
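
The abstract does not say which masking transform the project uses. As a minimal sketch of the Data Utility property only, assume the records are masked by multiplying every data column by the same orthogonal matrix. An orthogonal transform preserves all inner products between columns, so the least-squares estimates of a linear model come out the same before and after masking. The function names, the simulated data, and the choice of a single Householder reflection as the orthogonal matrix are all illustrative assumptions; one reflection is used for brevity and is not offered as a secure masking scheme.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(s * t for s, t in zip(a, b))


def householder_mask(columns, v):
    """Apply the orthogonal matrix Q = I - 2 v v^T / (v^T v) to each column.

    Hypothetical masking step: Q is a Householder reflection defined by the
    secret vector v. Every column must have the same length as v.
    """
    vv = dot(v, v)
    masked = []
    for col in columns:
        scale = 2.0 * dot(v, col) / vv
        masked.append([c - scale * a for c, a in zip(col, v)])
    return masked


def ols_two_columns(x0, x1, y):
    """Least-squares fit of y ~ b0*x0 + b1*x1 via the 2x2 normal equations."""
    a, b, d = dot(x0, x0), dot(x0, x1), dot(x1, x1)
    e, f = dot(x0, y), dot(x1, y)
    det = a * d - b * b
    return ((d * e - b * f) / det, (a * f - b * e) / det)


# Simulated data (illustrative only): intercept column, one covariate, outcome.
rng = random.Random(0)
n = 50
x0 = [1.0] * n
x1 = [rng.gauss(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * x + rng.gauss(0, 0.5) for x in x1]

# Mask all three columns with the same secret reflection.
v = [rng.gauss(0, 1) for _ in range(n)]
m0, m1, my = householder_mask([x0, x1, y], v)

beta_raw = ols_two_columns(x0, x1, y)
beta_masked = ols_two_columns(m0, m1, my)
print("raw estimates:   ", beta_raw)
print("masked estimates:", beta_masked)
```

The two printed coefficient pairs agree up to floating-point rounding, even though the masked rows no longer correspond to individual records. Note that the intercept column must be masked along with the other columns, so an analyst working with masked data fits a model without adding a fresh column of ones.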

Public Health Relevance

The interdisciplinary project team will combine expertise from biostatistics, computer science, cyber-security, and medical practice to develop data sharing technologies, in particular, methods of data masking that promise to allow free exchange of medical information with strong privacy protection.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM118737-03
Application #: 9747306
Study Section: Biomedical Computing and Health Informatics Study Section (BCHI)
Program Officer: Brazhnik, Paul

Project Start: 2017-09-15
Project End: 2020-07-31
Budget Start: 2019-08-01
Budget End: 2020-07-31
Support Year: 3
Fiscal Year: 2019
Total Cost
Indirect Cost

Institution

Name: University of Florida
Department: Biostatistics & Other Math Sci
Type: Schools of Medicine
DUNS #: 969663814

City: Gainesville
State: FL
Country: United States
Zip Code: 32611

Related projects


NIH 2019 R01 GM	New Statistical and Computing Technologies for Breaking the Barrier to Medical Data Sharing Wu, Samuel S.; Chen, Shigang / University of Florida
NIH 2018 R01 GM	New Statistical and Computing Technologies for Breaking the Barrier to Medical Data Sharing Wu, Samuel S.; Chen, Shigang / University of Florida
NIH 2017 R01 GM	New Statistical and Computing Technologies for Breaking the Barrier to Medical Data Sharing Wu, Samuel S.; Chen, Shigang / University of Florida
