Integrating large-scale genomics data has great potential to accelerate the identification of human disease genes. Three major challenges limit current integrative approaches to predicting disease genes. First, previous integrations generally restrict genomic data input to one species at a time, whereas disease datasets are often generated in multiple model organisms. Second, public functional genomic datasets are dominated by, and biased toward, certain data types and accessible tissues, a bias that expert curation of input datasets can address. Third, when multiple tissue-specific networks have been generated, no mathematical formulation exists to prioritize among these competing networks for the specific disease under consideration. This collaborative proposal aims to address these challenges by prototyping bioinformatics tools that integrate multiple relevant global and tissue-specific networks across mammalian species for a specific disease, here ataxia. The proposal builds on our preliminary data in developing both global and cerebellum-specific networks to prioritize ataxia-associated genes, and on the two PIs' complementary expertise in genomic data integration and experimental ataxia gene confirmation. We will 1) use domain-specific and multi-species data to establish global, brain, cerebellum, related-tissue, and ataxia-specific networks, and develop web tools to explore these networks; and 2) develop multiple kernel learning algorithms to weigh and integrate multiple networks to predict ataxia-associated genes. Although the algorithms will be developed for ataxia only, we envision that this expert-driven integrative approach will be adaptable to other disease gene identification scenarios.

Public Health Relevance

Computational networks generated through large-scale genomic data integration can help place genes, proteins, and their mutations into functional context. Traditional functional networks do not address tissue specificity and are limited to a single species, even though animal models have often significantly informed human disease research. We propose a strategy that integrates multiple tissue-specific functional networks to prioritize disease genes by incorporating expert knowledge and genomics data from multiple mammalian species. We will develop our strategy in the context of a rare genetic disease, ataxia, for which we have extensive expert knowledge in collecting relevant genomic datasets and have gathered an initial candidate gene set to test. We expect that a successful implementation of our pipeline will become a prototype for gene identification in other diseases using integrated tissue-specific networks, and that it may eventually be brought to clinical settings in which DNA from subjects with genetic disorders of unknown cause is being sequenced.

National Institutes of Health (NIH)
National Institute of Neurological Disorders and Stroke (NINDS)
Exploratory/Developmental Grants (R21)
Study Section
Genetic Variation and Evolution Study Section (GVE)
Program Officer
Gwinn, Katrina
University of Michigan Ann Arbor
Biostatistics & Other Math Sci
Schools of Medicine
Ann Arbor
United States
Duda, Marlena; Zhang, Hongjiu; Li, Hong-Dong et al. (2018) Brain-specific functional relationship networks inform autism spectrum disorder gene prediction. Transl Psychiatry 8:56
Li, Hong-Dong; Omenn, Gilbert S; Guan, Yuanfang (2016) A proteogenomic approach to understand splice isoform functions through sequence and expression-based computational modeling. Brief Bioinform 17:1024-1031
Panwar, Bharat; Menon, Rajasree; Eksi, Ridvan et al. (2016) Genome-Wide Functional Annotation of Human Protein-Coding Splice Variants Using Multiple Instance Learning. J Proteome Res 15:1747-53
Zhu, Fan; Panwar, Bharat; Guan, Yuanfang (2016) Algorithms for modeling global and context-specific functional relationship networks. Brief Bioinform 17:686-95
Guan, Yuanfang; Martini, Sebastian; Mariani, Laura H (2015) Genes Caught In Flagranti: Integrating Renal Transcriptional Profiles With Genotypes and Phenotypes. Semin Nephrol 35:237-44
Li, Hong-Dong; Menon, Rajasree; Govindarajoo, Brandon et al. (2015) Functional Networks of Highest-Connected Splice Isoforms: From The Chromosome 17 Human Proteome Project. J Proteome Res 14:3484-91
Zhu, Fan; Shi, Lihong; Engel, James Douglas et al. (2015) Regulatory network inferred using expression data of small sample size: application and validation in erythroid system. Bioinformatics 31:2537-44
Panwar, Bharat; Menon, Rajasree; Eksi, Ridvan et al. (2015) MI-PVT: A Tool for Visualizing the Chromosome-Centric Human Proteome. J Proteome Res 14:3762-7
Li, Hong-Dong; Omenn, Gilbert S; Guan, Yuanfang (2015) MIsoMine: a genome-scale high-resolution data portal of expression, function and networks at the splice isoform level in the mouse. Database (Oxford) 2015:bav045
Shi, Lihong; Sierant, M C; Gurdziel, Katherine et al. (2014) Biased, non-equivalent gene-proximal and -distal binding motifs of orphan nuclear receptor TR4 in primary human erythroid cells. PLoS Genet 10:e1004339

Showing the most recent 10 out of 17 publications