Contemporary systems biology is shifting the paradigm of biomedical research from reductionist studies of individual genes and proteins to the integration of information at the systems level. Current high-throughput biotechnologies enable the collection of large amounts of biological information, and different aspects of cellular systems are reflected in heterogeneous data, e.g., genomics, epigenomics, transcriptomics and metabolomics. However, it remains a major challenge to systematically integrate this body of information and derive biological insights at a mechanistic level. The overarching goal of this project is to develop a computational system that enables integration of various high-throughput "omics" data (an "integromics" approach) to gain insights into cellular systems, in particular signal transduction systems. The activities of the project are organized into four specific aims, which progress from approaches for capturing general information shared among multiple omics data types to more specific and complex models designed to decipher particular cellular signaling systems. First, we will develop a general framework, based on information theory and probabilistic models, to identify information modules that convey biological information between different "omics" data at large scale. Second, we will develop methods to investigate whether the information from the multiple omics data reflects causal relationships. Third, we will develop tools to recover missing information from the system to augment the high-throughput technologies. Finally, we will develop a unified model to elucidate signal transduction pathways by integrating information from multiple omics data in a manner that is both biologically sensible and mathematically rigorous. We expect that the methodologies developed in the project will be widely applicable to the study of a variety of cellular signal transduction systems.

Public Health Relevance

Cellular signaling systems play critical roles in maintaining the normal physiological environment of cells, organs and the human body; many human diseases, e.g., cancers and AIDS, result from disrupted cellular signaling systems. Investigating cellular signaling systems will not only help to understand the mechanisms of disease but will also facilitate the discovery of treatments. This project aims to develop novel computational methods to integrate information regarding different aspects of cellular signaling systems in a biologically sensible and mathematically principled manner, which will lead to both novel computational tools and biological discoveries.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section: Special Emphasis Panel (ZLM1-ZH-C (M3))
Program Officer: Ye, Jane
University of Pittsburgh
Schools of Medicine
United States
Osmanbeyoglu, Hatice Ulku; Lu, Kevin N; Oesterreich, Steffi et al. (2013) Estrogen represses gene expression through reconfiguring chromatin structures. Nucleic Acids Res 41:8061-71
Lu, Songjian; Jin, Bo; Cowart, L Ashley et al. (2013) From data towards knowledge: revealing the architecture of signaling systems by unifying knowledge mining and data mining of systematic perturbation data. PLoS One 8:e61134
Day, Roger S; McDade, Kevin K (2013) A decision theory paradigm for evaluating identifier mapping and filtering methods using data integration. BMC Bioinformatics 14:223
Montefusco, David J; Chen, Lujia; Matmati, Nabil et al. (2013) Distinct signaling roles of ceramide species in yeast revealed through systematic perturbation and systems biology analyses. Sci Signal 6:rs14
Lu, Songjian; Lu, Xinghua (2013) Using graph models to find transcription factor modules: the hitting set problem and an exact algorithm. Algorithms Mol Biol 8:2
Osmanbeyoglu, Hatice Ulku; Hartmaier, Ryan J; Oesterreich, Steffi et al. (2012) Improving ChIP-seq peak-calling for functional co-regulator binding by integrating multiple sources of biological information. BMC Genomics 13 Suppl 1:S1
Jin, Bo; Chen, Vicky; Chen, Lujia et al. (2011) Mapping annotations with textual evidence using an scLDA model. AMIA Annu Symp Proc 2011:834-42
Richards, Adam J; Muller, Brian; Shotwell, Matthew et al. (2010) Assessing the functional coherence of gene sets with metrics based on the Gene Ontology graph. Bioinformatics 26:i79-87
Jin, Bo; Lu, Xinghua (2010) Identifying informative subsets of the Gene Ontology with information bottleneck methods. Bioinformatics 26:2445-51
Cowart, L Ashley; Shotwell, Matthew; Worley, Mitchell L et al. (2010) Revealing a signaling role of phytosphingosine-1-phosphate in yeast. Mol Syst Biol 6:349