Integrating large-scale genomics data has great potential to accelerate the identification of human disease genes. Current integrative approaches to predicting disease genes face three major challenges. First, previous integrations generally limit genomic data input to one species at a time, while disease datasets are often generated in multiple model organisms. Second, public functional genomic datasets are dominated by, and biased toward, certain data types and accessible tissues, a problem that can be addressed by expert curation of input datasets. Third, when multiple tissue-specific networks have been generated, a mathematical formulation is lacking to prioritize among these competing networks for the specific disease under consideration. This collaborative proposal aims to address these challenges by exploring a prototype of bioinformatics tools that integrate multiple relevant global and tissue-specific networks across mammalian species, targeting a specific disease, here ataxia. The proposal is based on our preliminary data in developing both global and cerebellum-specific networks to prioritize ataxia-associated genes, and on the two PIs' complementary expertise in genomic data integration and experimental ataxia gene confirmation. We will 1) use domain-specific and multiple-species data to establish global, brain, cerebellum, related-tissue, and ataxia-specific networks, and develop web tools to explore these networks; and 2) develop multiple kernel learning algorithms to weigh and integrate multiple networks to predict ataxia-associated genes. Although the algorithms will be developed targeting ataxia only, we envision that this expert-driven integrative approach will be adaptable to other disease gene identification scenarios.
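As an illustration of the second aim, the following is a minimal sketch of one simple way to weigh and combine several networks: an alignment-based kernel weighting heuristic, which is among the simpler forms of multiple kernel learning. It is not the algorithm the proposal will develop. It assumes each network is available as a symmetric gene-by-gene similarity matrix usable as a kernel, and that a small set of genes carries labels (+1 for known disease genes, -1 for negatives). All function names, the toy networks, and the labels are hypothetical.

```python
import numpy as np

def center_kernel(K):
    """Center a kernel matrix so that its implicit features have zero mean."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def alignment(K, y):
    """Centered alignment between kernel K and the label kernel y y^T."""
    Kc = center_kernel(K)
    Yc = center_kernel(np.outer(y, y))
    denom = np.linalg.norm(Kc) * np.linalg.norm(Yc)  # Frobenius norms
    return float(np.sum(Kc * Yc) / denom) if denom > 0 else 0.0

def combine_networks(kernels, y, labeled):
    """Weight each network by its (non-negative) alignment on labeled genes."""
    idx = np.asarray(labeled)
    scores = np.array([
        max(alignment(K[np.ix_(idx, idx)], y[idx]), 0.0) for K in kernels
    ])
    total = scores.sum()
    if total > 0:
        weights = scores / total
    else:
        # No network is informative: fall back to uniform weights.
        weights = np.full(len(kernels), 1.0 / len(kernels))
    combined = sum(w * K for w, K in zip(weights, kernels))
    return weights, combined

def rank_genes(combined, y, labeled):
    """Score each gene by weighted connectivity to positives minus negatives."""
    idx = np.asarray(labeled)
    return combined[:, idx] @ y[idx]

# Hypothetical toy data: six genes, two networks.
# Network A groups genes {0, 1, 2} and {3, 4, 5}; network B groups by index mod 3.
genes = np.arange(6)
net_a = (genes[:, None] // 3 == genes[None, :] // 3).astype(float)
net_b = (genes[:, None] % 3 == genes[None, :] % 3).astype(float)

labels = np.array([1.0, 1.0, 0.0, -1.0, -1.0, 0.0])  # 0 marks unlabeled genes
labeled = [0, 1, 3, 4]

weights, combined = combine_networks([net_a, net_b], labels, labeled)
gene_scores = rank_genes(combined, labels, labeled)
```

In this toy setting, network A matches the labeled genes and receives essentially all of the weight, so the unlabeled gene 2 scores above the unlabeled gene 5. A real formulation would typically learn the weights jointly with a classifier and validate them on held-out genes.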
Computational networks generated through large-scale genomic data integration can help place genes, proteins, and their mutations into functional context. Traditional functional networks do not address tissue specificity and are limited to a single species, even though animal models have often significantly informed human disease research. We propose a strategy to integrate multiple tissue-specific functional networks to prioritize disease genes by incorporating expert knowledge and genomics data from multiple mammalian species. We will develop our strategy in the context of a rare genetic disease, ataxia, for which we have extensive expert knowledge for collecting relevant genomic datasets and have gathered an initial candidate gene set to test. We expect that the successful implementation of our pipeline will become a prototype for gene identification in other diseases using integrated tissue-specific networks, which may eventually be brought to clinical settings in which DNA from subjects with genetic disorders of unknown cause is being sequenced.