Significance: In this SBIR project, we propose to improve the performance of InSight, a machine-learning-based sepsis screening system, in situations of limited training data from the target clinical site. The proposed work will make possible prospective clinical deployments to sites which are smaller or lack clinical data repositories, by significantly reducing the amount of training data required, down to a few weeks of clinical observation. Classically, a machine-learning-based system like InSight requires complete retraining for each new clinical setting, which in turn requires a new and large collection of data from each target deployment site. We will circumvent this requirement via transfer learning techniques, which transfer knowledge acquired previously in a source clinical setting to a new, target setting.
Research Questions: Which transfer learning methods and paired classification algorithms are most suitable for use with InSight, requiring minimal target-site training data while maintaining strong performance? Are these methods and algorithms robust across the several common sepsis-spectrum definitions?
Prior Work: We have developed InSight using the MIMIC-III retrospective data set, on which it attains an area under the receiver operating characteristic curve (AUROC) of 0.88 for sepsis detection, and 0.74 for 4-hour early sepsis prediction. We have also conducted pilot transfer learning experiments in a different clinical task, mortality forecasting, in which transfer learning yields a 10-fold reduction in the amount of target-site training data required to achieve AUROC 0.80.
Specific Aims:
Aim 1 - to implement and assess, side by side, four diverse transfer learning methods for a retrospective clinical sepsis prediction task, where the source data set is MIMIC-III and the simulated clinical target is a data set drawn from UCSF.
Aim 2 - to determine which among the best methods from Aim 1 also provide robust performance when applied to two additional sepsis-spectrum gold standards.
Methods: We will implement transfer learning methods that use instance transfer, residual learning and/or feature augmentation, kernel length scale transfer, and feature transfer. We will test these methods with applicable classifiers on subsets of the UCSF set, using cross-validation and quantifying discrimination performance in terms of AUROC. The best method/classifier pairs will require no more than 30 examples of septic patients from the target set and attain an AUROC margin of superiority of 0.05 in 0- and 4-hour pre-onset sepsis prediction/detection, relative to the best tested alternative screening systems (Aim 1). The top three pairs will then be tested for robustness to gold standard choice, using septic shock (0- and 4-hour) and SIRS-based sepsis (0-hour) gold standards; in these tests, at least one pair must again attain a 0.05 margin of superiority in AUROC versus the alternative screening systems (Aim 2).
Future Directions: The results of these experiments will enable InSight to be robustly deployed to diverse clinical sites, yielding high performance without the need for extensive target-site data acquisition.
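As an illustration only, the kind of comparison described under Methods (a transfer-learned classifier versus a classifier trained on scarce target-site data, scored by AUROC) might be sketched as follows. This is not the InSight implementation: the data are synthetic, the classifier is a small logistic regression written for the example, instance transfer is reduced to fixed upweighting of target examples in a pooled training set, and a single held-out target test set stands in for cross-validation. All names (`make_site`, `fit_weighted_logistic`, `TARGET_WEIGHT`) and all numeric settings are hypothetical.

```python
import math
import random


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(theta, X):
    """Predicted probabilities for rows of X; theta is [bias, coef_1, ..., coef_d]."""
    return [sigmoid(theta[0] + sum(t * v for t, v in zip(theta[1:], x))) for x in X]


def fit_weighted_logistic(X, y, w, epochs=200, lr=0.1):
    """Batch gradient descent on the instance-weighted log loss."""
    theta = [0.0] * (len(X[0]) + 1)
    total = sum(w)
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for x, yi, wi, p in zip(X, y, w, predict(theta, X)):
            err = wi * (p - yi)
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        theta = [t - lr * g / total for t, g in zip(theta, grad)]
    return theta


def make_site(n, shift, rng):
    """Synthetic two-feature 'site'; `shift` moves feature means to mimic a site difference."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        X.append([rng.gauss(label + shift, 1.0), rng.gauss(0.5 * label - shift, 1.0)])
        y.append(label)
    return X, y


rng = random.Random(0)
X_src, y_src = make_site(500, 0.0, rng)    # large source-site set (assumed)
X_tgt, y_tgt = make_site(60, 0.5, rng)     # small target-site training set (assumed)
X_test, y_test = make_site(300, 0.5, rng)  # held-out target-site test set (assumed)

# Simplified instance transfer: pool both sites and upweight the target examples.
TARGET_WEIGHT = 5.0  # hypothetical value, not taken from the source
w_pool = [1.0] * len(X_src) + [TARGET_WEIGHT] * len(X_tgt)
theta_transfer = fit_weighted_logistic(X_src + X_tgt, y_src + y_tgt, w_pool)

# Baseline: train on the small target set alone.
theta_target_only = fit_weighted_logistic(X_tgt, y_tgt, [1.0] * len(X_tgt))

auroc_transfer = auroc(y_test, predict(theta_transfer, X_test))
auroc_target_only = auroc(y_test, predict(theta_target_only, X_test))
print(f"AUROC, pooled with target upweighting: {auroc_transfer:.3f}")
print(f"AUROC, target data only:               {auroc_target_only:.3f}")
```

The difference between the two printed values plays the role of the AUROC margin named in the aims. On synthetic data it says nothing about clinical performance, and which model scores higher depends on the assumed site shift and weighting.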

Public Health Relevance

Clinical decision support (CDS) systems present critical information to medical professionals by examining patient data and providing relevant information. Machine learning is a powerful method for creating CDS tools, but accessing its full strength requires re-training with retrospective data from each target clinical site. We will use transfer learning techniques to dramatically reduce the amount of target-site training data required by InSight, our machine-learning-based CDS tool for sepsis prediction, and empirically evaluate several such methods on a patient data set, using three different sepsis-related gold standards.

National Institutes of Health (NIH)
National Center for Advancing Translational Sciences (NCATS)
Small Business Innovation Research Grants (SBIR) - Phase I (R43)
Study Section: Special Emphasis Panel (ZRG1)
Program Officer: Colvis, Christine
Dascena, Inc.
United States