Understanding the state of cellular signaling systems provides insight into how cells behave under physiological and pathological conditions. Cellular signaling systems are organized as a hierarchy (cascade), and the signals of a molecule are often compositionally encoded to control cellular processes, such as gene expression. This project aims to develop advanced deep learning models (DLMs) to simulate cellular signaling systems based on gene expression data. In the last 3 years, the project has made significant progress, but challenges remain. Importantly, contemporary DLMs behave as "black boxes", in that it is difficult to interpret how signals are encoded and which signal a hidden node represents in a DLM. This black-box nature prevents researchers from gaining biological insights using DLMs, even though these models can be far superior to other types of models at modeling data in many tasks, e.g., predicting drug sensitivity of cancer cells. In this competitive renewal, we propose to develop novel DLMs and innovative inference algorithms to train "interpretable" DLMs and apply them in translational research. The proposed research is innovative and of high significance in several respects: 1) Our novel DLMs and algorithms take advantage of big data resulting from systematic chemical/genetic perturbations of cellular signaling machinery, so that we can use the perturbation condition as side information to reveal how signals are encoded in a DLM. 2) We integrate principles of causal inference and information theory with deep learning methods to make DLMs interpretable. As a result, researchers can gain mechanistic insights from such models. 3) Innovative application of interpretable DLMs will advance translational research.
For example, we will train interpretable DLMs to model cellular signaling at the level of single cells and use this information to investigate inter-cellular interactions in the tumor microenvironment to shed light on immune evasion mechanisms of cancers. We will also use information derived from interpretable DLMs to predict cancer cell drug sensitivity. We anticipate that our study will bring forth significant advances not only in deep learning methodology but also in precision medicine.

Public Health Relevance

This project aims to develop advanced machine learning methods, referred to as deep learning models, to simulate cellular signaling systems at both the multi-cell and single-cell levels. Success of these models will enable researchers to investigate cellular behaviors under physiological and pathological conditions, and such information can be used to guide therapy for cancer patients.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 2R01LM012011-05A1
Application #: 9972153
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane
Project Start: 2015-04-01
Project End: 2024-03-31
Budget Start: 2020-06-01
Budget End: 2021-03-31
Support Year: 5
Fiscal Year: 2020
Total Cost:
Indirect Cost:

Name: University of Pittsburgh
Department: Miscellaneous
Type: Schools of Medicine
DUNS #: 004514360
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15260

Ding, Michael Q; Chen, Lujia; Cooper, Gregory F et al. (2018) Precision Oncology beyond Targeted Therapy: Combining Omics Data with Machine Learning Matches the Majority of Cancer Cells to Effective Therapeutics. Mol Cancer Res 16:269-278
Young, Jonathan D; Cai, Chunhui; Lu, Xinghua (2017) Unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma. BMC Bioinformatics 18:381
Yan, Gaibo; Chen, Vicky; Lu, Xinghua et al. (2017) A signal-based method for finding driver modules of breast cancer metastasis to the lung. Sci Rep 7:10023
Huang, Tianzhi; Kim, Chung Kwon; Alvarez, Angel A et al. (2017) MST4 Phosphorylation of ATG4B Regulates Autophagic Activity, Tumorigenicity, and Radioresistance in Glioblastoma. Cancer Cell 32:840-855.e8
Chen, Vicky; Paisley, John; Lu, Xinghua (2017) Revealing common disease mechanisms shared by tumors of different tissues of origin through semantic representation of genomic alterations and topic modeling. BMC Genomics 18:105
Huang, Tianzhi; Alvarez, Angel A; Pangeni, Rajendra P et al. (2016) A regulatory circuit of miR-125b/miR-20b and Wnt signalling controls glioblastoma phenotypes through FZD6-modulated pathways. Nat Commun 7:12885
Hill, Steven M; Heiser, Laura M; Cokelaer, Thomas et al. (2016) Inferring causal molecular networks: empirical assessment through a community-based effort. Nat Methods 13:310-8
Chen, Lujia; Cai, Chunhui; Chen, Vicky et al. (2016) Learning a hierarchical representation of the yeast transcriptomic machinery using an autoencoder model. BMC Bioinformatics 17 Suppl 1:9
Lu, Songjian; Cai, Chunhui; Yan, Gonghong et al. (2016) Signal-Oriented Pathway Analyses Reveal a Signaling Complex as a Synthetic Lethal Target for p53 Mutations. Cancer Res 76:6785-6794
Lu, Songjian; Mandava, Gunasheil; Yan, Gaibo et al. (2016) An exact algorithm for finding cancer driver somatic genome alterations: the weighted mutually exclusive maximum set cover problem. Algorithms Mol Biol 11:11

Showing the most recent 10 out of 16 publications