Federated learning enables hospitals to collaboratively learn a shared global model while preserving patient privacy. However, it poses a major statistical challenge for this application because electronic health record (EHR) data are heterogeneous: hospitals differ in patient characteristics and in the clinical observations they record, i.e., in feature space. As a result, real-world EHR data from different hospitals are never independent and identically distributed (IID). The proposed research aims to overcome this statistical challenge while improving the security of federated learning by leveraging a large integrated EHR dataset with medical records for more than 21 million patients from 12 healthcare systems spanning 9 US states. A novel privacy-preserving federated transfer learning framework is proposed for building a robust and accurate acute kidney injury (AKI) prediction model that requires learning on real-world EHR data from siloed healthcare systems. This project will (1) develop novel transfer learning solutions to address three distinct non-IID EHR data analytic scenarios; (2) develop a novel federated learning framework with a dynamic weighting aggregation mechanism to build a robust and accurate AKI prediction model; and (3) develop a comprehensive privacy-preserving federated transfer learning framework with novel privacy-preserving solutions to address the unique privacy challenges in the proposed transfer learning applications.

The project proposes new transfer learning solutions to combat the non-IID challenge in federated learning, along with new security building blocks tailored for homogeneous and heterogeneous transfer learning tasks. Taken together, the project will develop a privacy-preserving federated transfer learning framework that provides a first practical solution for non-IID clinical data scenarios. The research methods and findings will open new directions in machine learning for healthcare and will contribute to both academic research and potential commercial products. More importantly, the interpretable nature of the base gradient boosting machine model in the proposed federated transfer learning framework will provide a better understanding of the predictors, which clinicians can use to design prevention and management strategies for high-risk patients. This project is jointly funded by Smart and Connected Health and the Established Program to Stimulate Competitive Research (EPSCoR).

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 2014552
Program Officer: Sylvia Spengler
Project Start:
Project End:
Budget Start: 2020-10-01
Budget End: 2024-09-30
Support Year:
Fiscal Year: 2020
Total Cost: $579,941
Indirect Cost:

Name: University of Kansas
Department:
Type:
DUNS #:
City: Lawrence
State: KS
Country: United States
Zip Code: 66045