Machine Learning for Integrative Modeling of the Immune System in Clinical Settings

In response to an immunological challenge, immune cells act in concert, forming complex and dense networks. A deep understanding of these immune responses is often the first step in developing immune therapies and diagnostic tests. Multivariate modeling algorithms can simultaneously consider all measured aspects of the immune system but require prohibitively large cohort sizes as technological advancements increase the number of measurements (the "curse of dimensionality"). To address this, we propose a series of studies to develop machine learning algorithms for comprehensive profiling of the immune system in clinical settings. In particular, for analysis of the immune system at the single-cell level, we will leverage the stochastic nature of clustering algorithms to produce a robust pipeline for prediction of clinical outcomes. Next, we introduce the immunological Elastic-Net (iEN) algorithm, which addresses both the curse of dimensionality and reproducibility by integrating prior immunological knowledge into the models. The cellular systems that govern immunity act through symbiotic interactions with multiple interconnected biological systems. The simultaneous interrogation of these systems with suitable technologies can reveal otherwise unrecognized crosstalk. In collaboration with several leading laboratories, we have produced multiomics datasets (including analysis of the genome, proteome, microbiome, and metabolome) in synchronized groups of patients. Using these coordinated datasets, we will evaluate several algorithms for combining multiple biological modalities while accounting for the intrinsic characteristics of each assay, to reveal biological crosstalk across various systems and increase combined predictive power.
Importantly, numerous population-level factors (including medical history, environmental, and socioeconomic factors) significantly impact the immune system, and studies focused on homogeneous patient populations often lack generalizability to other populations. To address this, we will develop machine learning strategies to integrate population-level factors directly into our immunological data. These models will objectively define subpopulations of patients and enable flexibility in the coefficients of the models (and hence, the importance of the various biological measurements) in each group. This research program will be executed using data from several biorepositories focused on various diseases. This approach will ensure generalizability of our work to previously unseen datasets and increase the long-term impact of our findings. Throughout the proposal, a major area of focus is the development of visualization and model-reduction strategies that lay the foundation for interpretation of complex models. The machine learning algorithms developed will be readily applicable to a broad range of multiomics and multicohort studies and will be available as open-source software.

Public Health Relevance

Recent technological advances have enabled the production of large immune monitoring datasets, providing an opportunity for systems-level efforts to harness the power of the immune system in developing immune therapies and diagnostic tests. In this project, we will develop machine learning algorithms for analysis of the immune system at a single-cell level, in a multiomics setting integrated with various other biological measurements, and subject to adjustments based on population-level factors. This work will provide a strong quantitative bridge between large-scale epidemiologic trends and deep biological profiling to investigate the complex mechanisms that govern the immune system in clinical settings.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Unknown (R35)
Project #
1R35GM138353-01
Application #
10028766
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Ravichandran, Veerasamy
Project Start
2020-09-05
Project End
2025-06-30
Budget Start
2020-09-05
Budget End
2021-06-30
Support Year
1
Fiscal Year
2020
Total Cost
Indirect Cost
Name
Stanford University
Department
Anesthesiology
Type
Schools of Medicine
DUNS #
009214214
City
Stanford
State
CA
Country
United States
Zip Code
94305