Cancer is a consequence of the accumulation of genetic alterations. Large whole-genome-scale resequencing projects such as The Cancer Genome Atlas (TCGA) have been launched in an effort to comprehensively catalog the genomic mutations and epigenetic modifications that are associated with cancer. It is essential to identify cancer-causing genes and pathways to gain insight into the disease mechanisms and hence facilitate early diagnosis and optimal treatment. However, identifying cancer-causing genes and their functional pathways remains challenging due to the complex biological interactions and the heterogeneity of diseases. Genetic mutations in disease-causing genes can disturb signaling pathways that impact the expression of a set of genes performing certain biological functions. We refer to a set of such genes as a functional module. We hypothesize that driver mutations, that is, mutations that lead to cancer progression, are likely to affect common disease-associated functional modules, and that the causal relationship between the mutations and the perturbed signals of the modules can be reconstructed from gene expression data and protein interaction data. In this project, we will develop a novel approach to infer disease-causing genes and networks by integrating information from multiple types of data, including genomic variations, gene expression and protein interactions. We first dynamically identify disease-associated modules that consist of a set of interacting genes, then develop a Bayesian approach to infer causative genes from the disease-associated modules. Then, by developing a stochastic-search-based method, we can determine the paths connecting causative genes and gene modules. As a result, disease-related pathways are inferred from the paths. Furthermore, we will integrate those pathways with the human interactome to discover higher-level disease-associated networks.
In addition, we will develop machine-learning-based classifiers to predict disease types and clinical outcomes utilizing the molecular signatures identified in this project, such as differentially expressed gene modules and causative genes. Our computational framework and classifiers will be made available to the research community via a webserver. The PI serves as the university bioinformatics program director and has extensive teaching and research experience. A goal of this project is also to provide scientific research training to students and to help students gain biological insight through their involvement with the project. Students will learn practical scientific computing skills from the PI and develop their own computational approaches to solving specific biomedical problems under the guidance of the PI. Thus the project will serve as an effective learning-research model in bioinformatics.
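The abstract's step of connecting causative genes to gene modules by stochastic search can be illustrated with a minimal sketch. It is not the project's actual method, which the abstract does not specify. It assumes a simple self-avoiding random walk over a toy protein-interaction network. All gene names (MUT1, MOD1, and so on), the edge list, and the function names are hypothetical.

```python
import random
from collections import Counter

# Toy, hypothetical protein-interaction network (undirected edges).
# MUT* stand in for candidate causative genes, MOD* for module genes.
INTERACTIONS = [
    ("MUT1", "A"), ("A", "B"), ("B", "MOD1"),
    ("MUT1", "C"), ("C", "MOD2"), ("B", "C"),
    ("MUT2", "D"), ("D", "MOD2"),
]


def build_graph(edges):
    """Build an adjacency map from an undirected edge list."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def random_walk_path(graph, source, targets, max_steps, rng):
    """Take one self-avoiding random walk from source.

    Returns the path as a tuple if a target gene is reached within
    max_steps, otherwise None (dead end or step budget exhausted).
    """
    path = [source]
    visited = {source}
    for _ in range(max_steps):
        # Sort so results do not depend on set iteration order.
        candidates = sorted(graph[path[-1]] - visited)
        if not candidates:
            return None
        nxt = rng.choice(candidates)
        path.append(nxt)
        visited.add(nxt)
        if nxt in targets:
            return tuple(path)
    return None


def sample_paths(graph, source, targets, n_walks=2000, max_steps=4, seed=0):
    """Repeat the walk many times and count how often each path is found."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_walks):
        path = random_walk_path(graph, source, targets, max_steps, rng)
        if path is not None:
            counts[path] += 1
    return counts


graph = build_graph(INTERACTIONS)
module = {"MOD1", "MOD2"}
path_counts = sample_paths(graph, "MUT1", module)
for path, count in path_counts.most_common(3):
    print(" -> ".join(path), count)
```

In this sketch, paths that are sampled more often are short or have few competing branches. A real analysis would presumably weight edges by expression or interaction evidence, which this toy example leaves out.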

Public Health Relevance

In this project, we will develop and implement a novel framework to identify causative mutations and pathways and study cancer disease mechanisms by reconstructing signaling pathways. This will help to advance personalized medicine and lead to a greater understanding of cancer mechanisms, which can lay the foundation for improving early diagnosis and treatment planning, as well as the discovery of new therapeutic targets.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Academic Research Enhancement Awards (AREA) (R15)
Project #: 1R15GM114739-01
Application #: 8880075
Study Section: Special Emphasis Panel (ZRG1)
Program Officer: Brazhnik, Paul
Project Start: 2015-07-01
Project End: 2018-06-30
Budget Start: 2015-07-01
Budget End: 2018-06-30
Support Year: 1
Fiscal Year: 2015
Total Cost:
Indirect Cost:
Name: University of Arkansas at Little Rock
Department: Biostatistics & Other Math Sci
Type: Biomed Engr/Col Engr/Engr Sta
DUNS #: 036725083
City: Little Rock
State: AR
Country: United States
Zip Code: 72204
Cogill, Steven B; Srivastava, Anand K; Yang, Mary Qu et al. (2018) Co-expression of long non-coding RNAs and autism risk genes in the developing human brain. BMC Syst Biol 12:91
Causey, Jason L; Ashby, Cody; Walker, Karl et al. (2018) DNAp: A Pipeline for DNA-seq Data Analysis. Sci Rep 8:6793
Li, Dan; Yang, William; Arthur, Carolyn et al. (2018) Systems biology analysis reveals new insights into invasive lung cancer. BMC Syst Biol 12:117
Yang, Mary Qu; Weissman, Sherman M; Yang, William et al. (2018) MISC: missing imputation for single-cell RNA sequencing data. BMC Syst Biol 12:114
Li, Dan; Yang, William; Zhang, Yifan et al. (2018) Genomic analyses based on pulmonary adenocarcinoma in situ reveal early lung cancer signature. BMC Med Genomics 11:106
He, Guannan; Liang, Yanchun; Chen, Yan et al. (2018) A hotspots analysis-relation discovery representation model for revealing diabetes mellitus and obesity. BMC Syst Biol 12:116
Guan, Renchu; Wang, Xu; Yang, Mary Qu et al. (2018) Multi-label Deep Learning for Gene Function Annotation in Cancer Pathways. Sci Rep 8:267
Zhang, Yifan; Yang, William; Li, Dan et al. (2018) Toward the precision breast cancer survival prediction utilizing combined whole genome-wide expression and somatic mutation analysis. BMC Med Genomics 11:104
Li, Dan; Yang, William; Zhang, Jialing et al. (2018) Transcription Factor and lncRNA Regulatory Networks Identify Key Elements in Lung Adenocarcinoma. Genes (Basel) 9:
Johann Jr, Donald J; Steliga, Mathew; Shin, Ik J et al. (2018) Liquid biopsy and its role in an advanced clinical trial for lung cancer. Exp Biol Med (Maywood) 243:262-271
