Core 1 - Research & Development
Contemporary biomedical and behavioral sciences require sophisticated computation. In Core 1, a team of quantitative scientists (information and computer scientists, biostatisticians, mathematicians, and software engineers) will develop the software infrastructure (i.e., the BCl core), services, and tools for use by biomedical and behavioral researchers. An illustration of the major components is shown in Figure B-1.
Current state-of-the-art research infrastructures containing biomedical data warehouses essentially have three levels of data disclosure: (1) query result counts, (2) de-identified data, and (3) identified data. De-identification and anonymization are related but distinct concepts. De-identification consists of removing particular identifiers, whereas anonymization ensures that data cannot be traced back to one particular individual. Simplistic measures (Murphy SN & Chueh HC 2002) are currently applied at level (1) to prevent information from being traced to a particular individual through the results of several query counts, and previous research indicates that the de-identification of data disclosed at level (2) is not sufficient to preserve individual privacy (Sweeney 1997). Therefore, robust anonymization algorithms are necessary at both levels (1) and (2). Formal proofs of adherence to quantitative privacy criteria are hard to produce and are consequently available for only a few methods in limited settings (Lasko 2007). As a result, most approaches in use today have not been rigorously validated, either theoretically or with real data. The three levels of disclosure outlined above are insufficient for responsible data sharing beyond the scope of an institutional IRB (in a HIPAA-covered entity), as in a federated data warehouse to which multiple institutions or sources can contribute data.
For this and other reasons, access to institutional clinical data repositories for research, some of which receive federal funding for their creation and/or maintenance, has been restricted to researchers who are formally affiliated with the institution. To address this limitation and progress towards a stage in which data can be shared across institutions, we propose research into: (a) a tool that interfaces between clinical data and a user and that can answer limited queries while ensuring that privacy is preserved, (b) a tool that can simulate real data in a privacy-preserving manner, to the point that the simulated data can be used as a proxy in population-based analyses, and (c) a cryptographic data submission protocol that hides the identity of the submitting entity.
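The abstract does not say which mechanism the query tool in aim (a) would use. As one illustration only, the sketch below answers a count query (disclosure level 1) by adding Laplace noise to the true count, a standard differential-privacy technique that is swapped in here as an assumption and is not taken from the proposal. The function name `noisy_count`, the `epsilon` parameter, and the synthetic `patients` records are all hypothetical.

```python
import random


def noisy_count(records, predicate, epsilon=1.0, rng=None):
    """Return the number of matching records, perturbed with Laplace noise.

    Hypothetical helper: a count query changes by at most 1 when one
    individual is added or removed, so Laplace noise of scale 1/epsilon
    is the usual calibration for epsilon-differential privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for r in records if predicate(r))
    # The difference of two independent Exponential(rate=epsilon) draws
    # follows a Laplace distribution with mean 0 and scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    # Rounding and clamping are post-processing of the noisy value only;
    # they never look at the true count again.
    return max(0, round(true_count + noise))


# Synthetic, made-up records (assumption: not data from the project).
patients = [{"age": a, "dx": "T1D" if a % 3 == 0 else "other"}
            for a in range(20, 80)]

print(noisy_count(patients, lambda p: p["dx"] == "T1D", epsilon=0.5))
```

Smaller values of `epsilon` add more noise and give stronger protection at the cost of accuracy. A real deployment would also need to limit how many queries each user may issue, since repeated noisy answers to the same question can be averaged to recover the true count.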

Agency: National Institutes of Health (NIH)
Institute: National Heart, Lung, and Blood Institute (NHLBI)
Type: Specialized Center--Cooperative Agreements (U54)
Project #: 5U54HL108460-04
Application #: 8509017
Study Section: Special Emphasis Panel (ZRG1-BST-K)
Project Start:
Project End:
Budget Start: 2013-07-01
Budget End: 2014-06-30
Support Year: 4
Fiscal Year: 2013
Total Cost: $2,121,472
Indirect Cost: $657,079
Name: University of California San Diego
Department:
Type:
DUNS #: 804355790
City: La Jolla
State: CA
Country: United States
Zip Code: 92093
Chen, Luyao; Aziz, Md Momin; Mohammed, Noman et al. (2018) Secure large-scale genome data storage and query. Comput Methods Programs Biomed 165:129-137
Groat, Danielle; Soni, Hiral; Grando, Maria Adela et al. (2018) Self-Reported Compensation Techniques for Carbohydrate, Exercise, and Alcohol Behaviors in Patients With Type 1 Diabetes on Insulin Pump Therapy. J Diabetes Sci Technol 12:412-414
Nguyen, Nghia H; Khera, Rohan; Ohno-Machado, Lucila et al. (2018) Annual Burden and Costs of Hospitalization for High-Need, High-Cost Patients With Chronic Gastrointestinal and Liver Diseases. Clin Gastroenterol Hepatol 16:1284-1292.e30
Weng, Wei-Hung; Wagholikar, Kavishwar B; McCray, Alexa T et al. (2017) Medical subdomain classification of clinical notes using a machine learning-based natural language processing approach. BMC Med Inform Decis Mak 17:155
Groat, Danielle; Grando, Maria A; Thompson, Bithika et al. (2017) A Methodology to Compare Insulin Dosing Recommendations in Real-Life Settings. J Diabetes Sci Technol 11:1174-1182
Doan, Son; Ritchart, Amanda; Perry, Nicholas et al. (2017) How Do You #relax When You're #stressed? A Content Analysis and Infodemiology Study of Stress-Related Tweets. JMIR Public Health Surveill 3:e35
Grando, Maria Adela; Groat, Danielle; Soni, Hiral et al. (2017) Characterization of Exercise and Alcohol Self-Management Behaviors of Type 1 Diabetes Patients on Insulin Pump Therapy. J Diabetes Sci Technol 11:240-246
Burgoyne, Adam M; De Siena, Martina; Alkhuziem, Maha et al. (2017) Duodenal-Jejunal Flexure GI Stromal Tumor Frequently Heralds Somatic NF1 and Notch Pathway Mutations. JCO Precis Oncol 2017:
Chen, Feng; Wang, Shuang; Jiang, Xiaoqian et al. (2017) PRINCESS: Privacy-protecting Rare disease International Network Collaboration via Encryption through Software guard extensionS. Bioinformatics 33:871-878
Vaidya, Jaideep; Shafiq, Basit; Asani, Muazzam et al. (2017) A Scalable Privacy-preserving Data Generation Methodology for Exploratory Analysis. AMIA Annu Symp Proc 2017:1695-1704

Showing the most recent 10 out of 176 publications