Systems-biology methods based on co-expression (co-exp) networks are powerful tools for understanding complex diseases. In a co-exp network, nodes represent genes, edges represent significant correlations between pairs of genes, and a module of connected nodes captures possible functional associations among the genes. Co-exp network methods have been successfully applied to understanding the genetic architecture of human populations and the mechanisms of complex diseases, such as Alzheimer's disease and diabetes. Despite their success in biological and medical applications, current co-exp network methods are unable to deal with genetic heterogeneity in the cohorts of samples of interest. Genetic heterogeneity is inherent in the cohorts of cases and controls in most complex disease studies. Failure to accommodate it will therefore result in incorrect co-exp network structures and consequently lead to erroneous causal relationships between genetic variations and disease phenotypes. However, developing co-exp network methods that adapt to genetic heterogeneity is a challenge because no correlation measure that is resilient to genetic heterogeneity currently exists, which seriously limits the power and applicability of co-exp network analysis. To address this challenge, we introduce a new correlation measure of gene expression that is resilient to genetic heterogeneity and propose a novel individual-centric co-exp network approach that accommodates genetic heterogeneity. Our initial application of these methods to a set of gene expression data from Alzheimer's disease produced a co-exp network module with coherent functions that are associated with the disease. This preliminary result provides the first evidence for the validity of the new methods. In the proposed research, we will fully develop our novel co-expression network approach (Aim 1).
To make the approach robust in the presence of genetic heterogeneity and noise in gene expression data, we will introduce a series of rigorous and unbiased tests for validating statistically and biologically significant network modules (Aim 2). Furthermore, we will extend our approach to integrate the results of co-exp network modules with information on genetic variations to support genetics of gene expression studies (Aim 3). We will apply the new methods to Alzheimer's disease, psoriasis and prostate cancer to examine the validity of our approach and, more importantly, to gain deep insights into the genetic bases of these complex diseases, which burden a substantial proportion of the human population (Aim 4). Finally, we will develop a freely available software package implementing our methods and a web-based online service for the research community to ease the computational burden of complex disease studies (Aim 5). The proposed research represents a fundamental paradigm shift from conventional analyses of gene expression data and has the potential to significantly advance research on complex diseases as well as other population variations.
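
As an illustration of the conventional co-exp network described above (genes as nodes, significant correlations as edges, connected nodes as modules), the following is a minimal sketch only. It uses plain Pearson correlation, which is the conventional measure and not the new heterogeneity-resilient measure proposed here, whose definition is not given in this abstract. The function name `coexpression_modules`, the fixed cutoff `threshold` standing in for a significance test, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def coexpression_modules(expr, threshold=0.9):
    """Group genes into modules from a genes-by-samples expression matrix.

    Hypothetical helper: an edge joins two genes when the absolute Pearson
    correlation of their expression profiles is at least `threshold`
    (a simple stand-in for a proper significance test). Modules are the
    connected components that contain more than one gene.
    """
    corr = np.corrcoef(expr)            # gene-by-gene Pearson correlation
    adj = np.abs(corr) >= threshold     # adjacency: strong correlations only
    np.fill_diagonal(adj, False)        # drop self-loops
    n = adj.shape[0]
    seen = np.zeros(n, dtype=bool)
    modules = []
    for start in range(n):
        if seen[start]:
            continue
        # Depth-first search collects one connected component.
        stack, component = [start], []
        seen[start] = True
        while stack:
            gene = stack.pop()
            component.append(gene)
            for neighbor in np.flatnonzero(adj[gene]):
                if not seen[neighbor]:
                    seen[neighbor] = True
                    stack.append(int(neighbor))
        if len(component) > 1:          # ignore isolated genes
            modules.append(sorted(component))
    return modules

# Toy data (made up): genes 0-2 are perfectly (anti-)correlated, gene 3 is not.
base = np.arange(10.0)
expr = np.vstack([
    base,                       # gene 0
    2.0 * base + 1.0,           # gene 1: positively correlated with gene 0
    -base,                      # gene 2: negatively correlated with gene 0
    np.tile([1.0, -1.0], 5),    # gene 3: alternating pattern, weak correlation
])
modules = coexpression_modules(expr, threshold=0.9)
print(modules)
```

Because this sketch computes one correlation across all samples, a cohort that mixes genetically distinct subgroups can blur or distort the edges it finds, which is the limitation the proposed individual-centric approach is meant to address.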

Public Health Relevance

Co-expression network modeling and analysis is a popular and successful approach for utilizing the growing amount of genome-wide gene expression profiling data. However, genetic heterogeneity, which is inherent in the cohorts of disease cases and controls of interest, causes existing co-expression network methods to fail or to be less effective. The proposed research aims to develop a new correlation measure for gene expression and a novel, individual-centric co-expression network approach that accommodates genetic heterogeneity, and subsequently to use these methods to further our understanding of three complex diseases (Alzheimer's disease, psoriasis and prostate cancer) and to develop an online computational service for the convenience of the research community.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Brazhnik, Paul
Washington University
Biostatistics & Other Math Sci
Biomed Engr/Col Engr/Engr Sta
Saint Louis
United States
Li, Lili; Tian, Guangmei; Peng, Hai et al. (2018) New class of transcription factors controls flagellar assembly by recruiting RNA polymerase II in Chlamydomonas. Proc Natl Acad Sci U S A 115:4435-4440
Michael, Todd P; Bryant, Douglas; Gutierrez, Ryan et al. (2017) Comprehensive definition of genome features in Spirodela polyrhiza by high-depth physical mapping and short-read DNA sequencing strategies. Plant J 89:617-635
Fu, Jingcheng; Zhang, Weixiong; Wu, Jianliang (2017) Identification of leader and self-organizing communities in complex networks. Sci Rep 7:704
Li, Lun; Fang, Zhiwei; Zhou, Junfei et al. (2017) An accurate and efficient method for large-scale SSR genotyping and applications. Nucleic Acids Res 45:e88
Xia, Jing; Wang, Xiaoqin; Perroud, Pierre-François et al. (2016) Endogenous Small-Noncoding RNAs and Potential Functions in Desiccation Tolerance in Physcomitrella Patens. Sci Rep 6:30118
Li, Ze-Yuan; Xia, Jing; Chen, Zheng et al. (2016) Large-scale rewiring of innate immunity circuitry and microRNA regulation during initial rice blast infection. Sci Rep 6:25493
Niu, Dongdong; Xia, Jing; Jiang, Chunhao et al. (2016) Bacillus cereus AR156 primes induced systemic resistance by suppressing miR825/825* and activating defense-related genes in Arabidopsis. J Integr Plant Biol 58:426-39
Chen, Lihong; Han, Jiapeng; Deng, Xiaomin et al. (2016) Expansion and stress responses of AP2/EREBP superfamily in Brachypodium distachyon. Sci Rep 6:21623
Tiosano, Dov; Audi, Laura; Climer, Sharlee et al. (2016) Latitudinal Clines of the Human Vitamin D Receptor and Skin Color Genes. G3 (Bethesda) 6:1251-66
He, Dongxiao; Jin, Di; Chen, Zheng et al. (2015) Identification of hybrid node and link communities in complex networks. Sci Rep 5:8638

Showing the most recent 10 out of 28 publications