Integration of multiscale genomic data for comprehensive analysis of complex dise

Wang, Yu-Ping

Abstract

Complex diseases are caused by a variety of genomic, transcriptomic, epigenomic, and proteomic factors, and many studies have suggested that these factors do not act in isolation but rather interact (crosstalk) at multiple levels and depend on one another in an intertwined manner. A variety of genomic techniques, such as SNP genotyping, gene expression microarrays, and the emerging next-generation sequencing (NGS), have generated vast amounts of multiscale genomic data, providing multi-dimensional and complementary information. However, these multiscale genomic data have not yet been well integrated and associated with clinical data for comprehensive analysis of a disease. The difficulty lies in the complexity and heterogeneity of multi-omics data. In addition, the specific properties of these data (e.g., their correlations across multiple levels, small sample sizes but large numbers of biomarkers, and group structures) have not been well considered, which necessitates a paradigm shift in the technical approaches. The goal of this project is therefore to tackle these significant bioinformatics challenges by developing innovative integration approaches, such as sparse models, that take the specific features of multiscale genomic data into account. Furthermore, we will apply them to the diagnosis (e.g., identification of genes) and risk prediction of complex diseases (e.g., osteoporosis). Our multi-/inter-disciplinary research team, consisting of statisticians, geneticists, molecular biologists, bioinformaticians, and biomedical engineers with complementary expertise, has worked synergistically in the past few years and contributed significantly to the development of data integration approaches.
Building on this work, we plan to accomplish the following specific aims: 1) to extract genetic signatures (e.g., CNVs) from multiple NGS samples and incorporate them into multi-omics studies; 2) to study the crosstalk/correlations between multi-omics data, from which epistatic networks can be detected; 3) to develop data integration techniques that can combine multiple genomic factors for the identification of risk genes and regions; and 4) to construct a sparse regression model that predicts quantitative traits with increased power from multiple sources of genomic information, including pathways and interaction networks. We will validate our model with the study of osteoporosis at the Tulane Center for Bioinformatics and Genomics. With data collected from over 20,000 patients, we have, to our knowledge, the largest and most comprehensive datasets, which will serve as a unique platform for validating our approaches. We anticipate that the project will have a large and sustained impact. The successful implementation of the project will enable us to 1) better elucidate specific genetic risk mechanisms for osteoporosis; 2) search for potential drug targets; and 3) ultimately obtain novel approaches for better prevention and treatment of osteoporosis. Upon completion of the project, we will provide a set of efficient and powerful analytical tools for integrative data analysis and make them freely available through our ongoing software development of GCATs (Genomic Convergence Analysis Tools) for multiscale genomic data management and analysis.
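
As a rough illustration of the kind of sparse regression the abstract refers to, the sketch below fits a plain lasso by cyclic coordinate descent in a setting with few samples and many candidate biomarkers. It is a generic textbook technique, not the project's own model, which the abstract does not specify. The simulated data, the function names (soft_threshold, lasso_coordinate_descent, objective), and the penalty choice are all assumptions made for this example.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exact zeros give sparsity."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, penalty, n_sweeps=200):
    """Minimize (1/2n)*||y - X b||^2 + penalty*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    residual = list(y)  # equals y - X b, since b starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual, with j's own fit added back.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * coef[j]
            new_coef = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_coef - coef[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                coef[j] = new_coef
    return coef

def objective(X, y, coef, penalty):
    """Lasso objective: half mean squared error plus the L1 penalty."""
    n = len(X)
    rss = sum((y[i] - sum(x * b for x, b in zip(X[i], coef))) ** 2 for i in range(n))
    return rss / (2 * n) + penalty * sum(abs(b) for b in coef)

# Hypothetical simulated data: 30 samples, 100 candidate markers, 3 truly causal.
random.seed(0)
n_samples, n_features = 30, 100
true_coef = {3: 2.0, 17: -1.5, 42: 1.0}
X = [[random.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [sum(X[i][j] * b for j, b in true_coef.items()) + random.gauss(0.0, 0.1)
     for i in range(n_samples)]

# Smallest penalty at which every coefficient is shrunk to exactly zero.
penalty_max = max(abs(sum(X[i][j] * y[i] for i in range(n_samples)) / n_samples)
                  for j in range(n_features))
penalty = 0.5 * penalty_max  # assumed value; in practice chosen by cross-validation

coef = lasso_coordinate_descent(X, y, penalty)
selected = [j for j, b in enumerate(coef) if b != 0.0]
print("markers kept by the lasso:", selected)
```

The L1 penalty is what handles the "small sample size but large number of biomarkers" setting: most coefficients are set exactly to zero, so the fit stays well defined even though there are more markers than samples. Group structures or network priors, as mentioned in the aims, would require a different penalty than the plain L1 term used here.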

Public Health Relevance

The study of genetic mechanisms underlying complex diseases is of paramount importance for diagnosis and prognosis. Our integrated and comprehensive approach promises to greatly change current ways of analyzing genomic data, which, for example, do not fully utilize correlated and complementary information or incorporate prior knowledge from multi-omics data. Given the ubiquitous use of multi-omics techniques in biomedicine, our approaches, which offer a new paradigm for multiscale genomic data integration, will therefore have a significant impact.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM109068-05
Application #: 9551656
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Krasnewich, Donna M

Project Start: 2014-09-17
Project End: 2019-08-31
Budget Start: 2018-09-01
Budget End: 2019-08-31
Support Year: 5
Fiscal Year: 2018

Institution

Name: Tulane University
Department: Biomedical Engineering
Type: Schools of Arts and Sciences
DUNS #: 053785812

City: New Orleans
State: LA
Country: United States
Zip Code: 70118

Related projects


NIH 2018 R01 GM	Integration of multiscale genomic data for comprehensive analysis of complex dise Wang, Yu-Ping / Tulane University
NIH 2017 R01 GM	Integration of multiscale genomic data for comprehensive analysis of complex dise Wang, Yu-Ping / Tulane University
NIH 2016 R01 GM	Integration of multiscale genomic data for comprehensive analysis of complex dise Wang, Yu-Ping / Tulane University
NIH 2015 R01 GM	Integration of multiscale genomic data for comprehensive analysis of complex dise Wang, Yu-Ping / Tulane University	$308,409
NIH 2014 R01 GM	Integration of multiscale genomic data for comprehensive analysis of complex dise Wang, Yu-Ping / Tulane University	$311,844

Publications

Liu, Li; Wen, Yan; Zhang, Lei et al. (2018) Assessing the Associations of Blood Metabolites With Osteoporosis: A Mendelian Randomization Study. J Clin Endocrinol Metab 103:1850-1855

Li, Yumei; Xiang, Yang; Xu, Chao et al. (2018) Rare variant association analysis in case-parents studies by allowing for missing parental genotypes. BMC Genet 19:7

Xu, Chao; Fang, Jian; Shen, Hui et al. (2018) EPS-LASSO: test for high-dimensional regression under extreme phenotype sampling of continuous traits. Bioinformatics 34:1996-2003

Zille, Pascal; Calhoun, Vince D; Wang, Yu-Ping (2018) Enforcing Co-Expression Within a Brain-Imaging Genomics Regression Framework. IEEE Trans Med Imaging 37:2561-2571

Lin, Xu; Peng, Cheng; Greenbaum, Jonathan et al. (2018) Identifying potentially common genes between dyslipidemia and osteoporosis using novel analytical approaches. Mol Genet Genomics 293:711-723

Alam, Md Ashad; Fukumizu, Kenji; Wang, Yu-Ping (2018) Influence Function and Robust Variant of Kernel Canonical Correlation Analysis. Neurocomputing 304:12-29

Fang, Jian; Xu, Chao; Zille, Pascal et al. (2018) Fast and Accurate Detection of Complex Imaging Genetics Associations Based on Greedy Projected Distance Correlation. IEEE Trans Med Imaging 37:860-870

Zille, Pascal; Calhoun, Vince D; Stephen, Julia M et al. (2018) Fused Estimation of Sparse Connectivity Patterns From Rest fMRI-Application to Comparison of Children and Adult Brains. IEEE Trans Med Imaging 37:2165-2175

Deng, Su-Ping; Hu, Wenxing; Calhoun, Vince D et al. (2018) Integrating Imaging Genomic Data in the Quest for Biomarkers of Schizophrenia Disease. IEEE/ACM Trans Comput Biol Bioinform 15:1480-1491

Hu, Wenxing; Lin, Dongdong; Cao, Shaolong et al. (2018) Adaptive Sparse Multiple Canonical Correlation Analysis With Application to Imaging (Epi)Genomics Study of Schizophrenia. IEEE Trans Biomed Eng 65:390-399

Showing the most recent 10 out of 55 publications
