The objective of this proposal is to provide a robust course of training for Gilmer Valdes, PhD, DABR, a candidate with an excellent foundation in clinical and machine learning research, to enable him to become an independent investigator. The proposed research aims to address a tradeoff between the interpretability and accuracy of modern machine learning algorithms, which limits their use in clinical practice. The candidate's central hypothesis is that the current tradeoff is not a law of nature but rather a limitation of current interpretable machine learning algorithms. Toward proving this hypothesis, the candidate, leading a multidisciplinary team, has developed unique mathematical frameworks (MediBoost and the Conditional Interpretable Super Learner) to build interpretable and accurate models. The proposed research will I) implement and extensively benchmark these frameworks and II) use the algorithms developed to solve three clinical problems where potentially suboptimal models are currently used to make clinical decisions: 1) predicting mortality in the Intensive Care Unit, 2) predicting risk of Hospital Acquired Venous Thromboembolism, and 3) predicting which prostate cancer patients benefit the most from adjuvant radiotherapy. The candidate's training and research plan, multidisciplinary by nature, takes advantage of the proximity of UC San Francisco, Stanford and UC Berkeley and cannot be easily replicated elsewhere. Recognizing the multidisciplinary nature of the proposed work, the candidate will be mentored by and work closely with a stellar committee from three institutions and different scientific areas (Machine Learning, Biostatistics, Statistics, Hospital Medicine, Cancer Research and Quality Assurance in Medicine): Jerome H. Friedman PhD (Stanford Statistics Department), Mark Van der Laan PhD (Berkeley Biostatistics and Statistics Department), Mark Segal (UCSF Epidemiology and Biostatistics Department), Andrew Auerbach MD (UCSF Medicine Department), Felix Y. Feng MD (UCSF Radiation Oncology), and Timothy D. Solberg PhD (UCSF Radiation Oncology). This committee will be coordinated by Dr. Solberg. The candidate can also count on a strong multidisciplinary team of collaborators. Successful completion of the proposed research will develop the next generation of accurate and interpretable machine learning algorithms and solve three important clinical problems where linear models are currently used in clinical settings. This proposal has wide-ranging implications across the healthcare spectrum. The intermediate-term goal is for the candidate to acquire the knowledge, technical skills and expertise necessary to submit a successful R01 proposal.

Public Health Relevance

Current state-of-the-art machine learning algorithms have a marked tradeoff between accuracy and interpretability. In medicine, where errors can have dire consequences and knowledge representation and validation are as relevant as accuracy, the development of accurate and interpretable algorithms is of paramount importance. My research project will address a critical public health need by developing machine learning algorithms that are both accurate and interpretable, and applying them to solve specific clinical problems.

National Institutes of Health (NIH)
National Institute of Biomedical Imaging and Bioengineering (NIBIB)
Clinical Investigator Award (CIA) (K08)
Study Section
Special Emphasis Panel (ZEB1)
Program Officer
Peng, Grace
University of California San Francisco
Schools of Medicine
San Francisco
United States