The effective and efficient use of the big data accumulated in the biomedical sciences, including genomic and imaging data collected from patients and often integrated with their electronic health records, represents both a great opportunity and a major challenge for biomedical data science. Because biomedical data are collected from individual patients and thus carry identifiable information about the data donors, the protection of their privacy is an important concern in large-scale projects, including the recently launched Precision Medicine Initiative. In the past few years, significant progress has been made on cryptographic techniques, including homomorphic encryption (HME), which enables direct analysis of encrypted data without decrypting it, and Secure Multiparty Computing (SMC), which allows two or more organizations to jointly compute a task without exposing their inputs to each other. Here, based on these techniques, we propose to develop a suite of encryption protocols and open-source software tools that biomedical researchers can use in a plug-and-play manner for the statistical analysis of encrypted biomedical data. Our methods assume that biomedical data will be protected by encryption as soon as they are generated and that all subsequent analysis and sharing will be performed on the encrypted form, which achieves a high security standard for privacy protection.
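To illustrate what "analysis of encrypted data without decrypting it" can mean in practice, the following is a minimal sketch of an additively homomorphic scheme (textbook Paillier) in Python. It is not the protocol proposed in this project, and the function names, the tiny hard-coded primes, and the example counts are all assumptions chosen for readability; keys of this size offer no real security.

```python
import math
import secrets


def keygen(p=1009, q=1013):
    """Toy Paillier key generation (hypothetical tiny primes, insecure)."""
    n = p * q
    # lambda = lcm(p - 1, q - 1)
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    # With generator g = n + 1, mu is the inverse of lambda modulo n.
    mu = pow(lam, -1, n)
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover the plaintext with the private key (lambda, mu)."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    n, priv = keygen()
    # Hypothetical per-site patient counts, encrypted before sharing.
    counts = [12, 7, 30]
    ciphertexts = [encrypt(n, m) for m in counts]

    # An untrusted party sums the counts without seeing any of them.
    total = ciphertexts[0]
    for c in ciphertexts[1:]:
        total = add_encrypted(n, total, c)

    print(decrypt(n, priv, total))
```

In this sketch the party that aggregates the ciphertexts never holds the private key, so only the key holder learns the total. Schemes that also support multiplication or approximate arithmetic are needed for richer statistics such as regression.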
We propose to develop encryption methods for biomedical data mining and to implement these methods in open-source software that biomedical researchers can use in a plug-and-play manner for the statistical analysis of encrypted biomedical data. Under our approach, biomedical data will be protected by encryption as soon as they are generated, and all subsequent analysis and sharing will be performed on the encrypted form, which achieves a high security standard for privacy protection in biomedical data science.
Kim, Miran; Song, Yongsoo; Wang, Shuang et al. (2018) Secure Logistic Regression Based on Homomorphic Encryption: Design and Evaluation. JMIR Med Inform 6:e19
Kim, Andrey; Song, Yongsoo; Kim, Miran et al. (2018) Logistic regression model training based on the approximate homomorphic encryption. BMC Med Genomics 11:83
Sadat, Md Nazmus; Jiang, Xiaoqian; Aziz, Md Momin Al et al. (2018) Secure and Efficient Regression Analysis Using a Hybrid Cryptographic Framework: Development and Evaluation. JMIR Med Inform 6:e14
Wang, Meng; Ji, Zhanglong; Kim, Hyeon-Eui et al. (2018) Selecting Optimal Subset to release under Differentially Private M-estimators from Hybrid Datasets. IEEE Trans Knowl Data Eng 30:573-584
Bonomi, Luca; Jiang, Xiaoqian (2018) Linking temporal medical records using non-protected health information data. Stat Methods Med Res 27:3304-3324
Bu, Diyue; Wang, Xiaofeng; Tang, Haixu (2018) Real-time Protection of Genomic Data Sharing in Beacon Services. AMIA Jt Summits Transl Sci Proc 2017:45-54
Miotto, Riccardo; Wang, Fei; Wang, Shuang et al. (2018) Deep learning for healthcare: review, opportunities and challenges. Brief Bioinform 19:1236-1246
Vaidya, Jaideep; Shafiq, Basit; Asani, Muazzam et al. (2017) A Scalable Privacy-preserving Data Generation Methodology for Exploratory Analysis. AMIA Annu Symp Proc 2017:1695-1704
Ghasemi, Reza; Al Aziz, Md Momin; Mohammed, Noman et al. (2017) Private and Efficient Query Processing on Outsourced Genomic Databases. IEEE J Biomed Health Inform 21:1466-1472
Wang, Shuang; Jiang, Xiaoqian; Tang, Haixu et al. (2017) A community effort to protect genomic data sharing, collaboration and outsourcing. NPJ Genom Med 2:33
Showing the most recent 10 out of 14 publications