The NSF Convergence Accelerator supports use-inspired, team-based, multidisciplinary efforts that address challenges of national importance and will produce deliverables of value to society in the near future.

This project, NSF Convergence Accelerator Track D: A Trusted Integrative Model and Data Sharing Platform for Accelerating AI-Driven Health Innovation, will develop a novel health-related federated learning and model-sharing platform, LEARNER, to enable collaborative big data mining for biomedical applications by integrating cross-disciplinary expertise from machine learning, trustworthy AI, and biomedical data science. LEARNER will incorporate novel asynchronous federated learning algorithms based on rigorous theoretical foundations using trustworthy AI techniques, fairness-aware and interpretable machine learning models, large-scale computational strategies, and effective software tools to reveal the complex relationships among heterogeneous health data. The project will address critical challenges in exploiting big data for biomedicine and health, which include access to large data collections, the computational intensity of AI/ML algorithms, the complexity of hyperparameter tuning, and the need for effective multidisciplinary expertise and collaboration. Data privacy is another critical concern, since health data are intrinsically sensitive and could be exploited to reveal an individual’s identity even when carefully anonymized. LEARNER will include a suite of collaborative data analysis and privacy-preserving mechanisms and tools that will securely support various types of health data analytics, including mechanisms to detect potential data privacy leaks. Machine learning models typically involve complex optimization procedures, and their results can be difficult to interpret, replicate, and reproduce. Novel methods will be employed to improve the interpretability and reproducibility of complex health data analytics models.

The project team, with individuals from academia and industry, will develop an interdisciplinary program for the training and education of graduate and undergraduate students. A cross-disciplinary course on Health Data Science will also be developed for beginning graduate students and senior undergraduate students from a variety of programs, including Computer Science and Engineering, Informatics, Electrical Engineering, Biomedical Engineering, Biology, and Statistics. The project will put special emphasis on attracting female and underrepresented minority students to explore advanced computational technologies in the context of the LEARNER platform. Interested senior undergraduate students will be able to work on small, well-defined, and well-scoped projects, which will enable them to work with graduate students and the PI team of the project. Such projects could also be undertaken as summer projects by undergraduate students in science and engineering.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Duke University
United States