Complex diseases are caused by a variety of genomic, transcriptomic, epigenomic, and proteomic factors. Many studies have suggested that these factors do not act in isolation but interact (crosstalk) at multiple levels and depend on one another in an intertwined manner. A variety of genomic techniques, such as SNP genotyping, microarray gene expression profiling, and the emerging next-generation sequencing (NGS), have generated vast amounts of multiscale genomic data that provide multi-dimensional and complementary information. However, these multiscale genomic data have not yet been well integrated or associated with clinical data for comprehensive analysis of a disease. The difficulty lies in the complexity and heterogeneity of multi-omics data. In addition, the specific properties of these data (e.g., their correlations across multiple levels, small sample sizes relative to a large number of biomarkers, and group structures) have not been well considered, which necessitates a paradigm shift in technical approaches. The goal of this project is therefore to tackle these significant bioinformatics challenges by developing innovative integration approaches, such as sparse models, that account for the specific features of multiscale genomic data. Furthermore, we will apply them to the diagnosis of complex diseases (e.g., identification of genes) and the prediction of disease risk (e.g., for osteoporosis). Our multi-/inter-disciplinary research team, consisting of statisticians, geneticists, molecular biologists, bioinformaticians, and biomedical engineers with complementary expertise, has worked synergistically over the past few years and contributed significantly to the development of data integration approaches.
Building on this work, we plan to accomplish the following specific aims: 1) To extract genetic signatures (e.g., CNVs) from multiple NGS samples and incorporate them into multi-omics studies; 2) To study the crosstalk and correlations among multi-omics data, from which epistatic networks can be detected; 3) To develop data integration techniques that combine multiple genomic factors for the identification of risk genes and regions; and 4) To construct a sparse regression model that predicts quantitative traits with increased power from multiple sources of genomic information, including pathways and interaction networks. We will validate our model in the study of osteoporosis at the Tulane Center for Bioinformatics and Genomics. With data collected from over 20,000 patients, we have, to our knowledge, the largest and most comprehensive datasets, which will serve as a unique platform for validating our approaches. We anticipate that the project will have a large and sustained impact. Successful implementation of the project will enable us to 1) better elucidate specific genetic risk mechanisms for osteoporosis; 2) search for potential drug targets; and 3) ultimately obtain novel approaches for better prevention and treatment of osteoporosis. Upon completion of the project, we will provide a set of efficient and powerful analytical tools for integrative data analysis and make them freely available through our ongoing software development of GCATs (Genomic Convergence Analysis Tools) for multiscale genomic data management and analysis.
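To make the notion of a sparse regression model in Aim 4 concrete, the following is a minimal sketch of a generic lasso fit in the "small sample size, large number of biomarkers" setting. It is not the project's model: the data are synthetic, the solver is a plain proximal gradient (ISTA) loop chosen for brevity, and the names `soft_threshold`, `lasso_ista`, and the penalty value `lam=0.5` are illustrative assumptions. It does not use pathway, network, or group information.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t; entries within [-t, t] become 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=2000):
    """Approximately minimise (1/2n)||y - X b||^2 + lam * ||b||_1 (hypothetical helper)."""
    n, p = X.shape
    # Lipschitz constant of the gradient: largest eigenvalue of X^T X / n.
    lipschitz = np.linalg.norm(X, 2) ** 2 / n
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        # Gradient step followed by the L1 proximal (soft-thresholding) step.
        beta = soft_threshold(beta - grad / lipschitz, lam / lipschitz)
    return beta

# Synthetic example (assumption): 60 samples, 500 features, 5 of them truly associated.
rng = np.random.default_rng(0)
n, p = 60, 500
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:5] = [3.0, -2.5, 2.0, -3.0, 2.5]
y = X @ true_beta + 0.1 * rng.standard_normal(n)

beta_hat = lasso_ista(X, y, lam=0.5)
selected = np.flatnonzero(beta_hat)  # indices of features kept by the sparse model
print("number of selected features:", selected.size)
```

The L1 penalty sets most coefficients exactly to zero, which is what allows a model with far more features than samples to remain interpretable. Extensions that exploit group structure or correlations across omics levels would replace the penalty term, not the overall scheme.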

Public Health Relevance

The study of genetic mechanisms underlying complex diseases is of paramount importance for diagnosis and prognosis. Our integrated and comprehensive approach promises to greatly change current ways of genomic data analysis, which do not fully utilize correlated and complementary information or incorporate prior knowledge from multi-omics data. Given the ubiquitous use of multi-omics techniques in biomedicine, our approaches, built on a new paradigm for multiscale genomic data integration, will therefore have a significant impact.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Krasnewich, Donna M
Tulane University
Biomedical Engineering
Schools of Arts and Sciences
New Orleans
United States
Deng, Su-Ping; Hu, Wenxing; Calhoun, Vince D et al. (2018) Integrating Imaging Genomic Data in the Quest for Biomarkers of Schizophrenia Disease. IEEE/ACM Trans Comput Biol Bioinform 15:1480-1491
Hu, Wenxing; Lin, Dongdong; Cao, Shaolong et al. (2018) Adaptive Sparse Multiple Canonical Correlation Analysis With Application to Imaging (Epi)Genomics Study of Schizophrenia. IEEE Trans Biomed Eng 65:390-399
Gossmann, Alexej; Cao, Shaolong; Brzyski, Damian et al. (2018) A Sparse Regression Method for Group-Wise Feature Selection with False Discovery Rate Control. IEEE/ACM Trans Comput Biol Bioinform 15:1066-1078
Liu, Hui-Min; He, Jing-Yang; Zhang, Qiang et al. (2018) Improved detection of genetic loci in estimated glomerular filtration rate and type 2 diabetes using a pleiotropic cFDR method. Mol Genet Genomics 293:225-235
Lin, Dongdong; Chen, Jiayu; Ehrlich, Stefan et al. (2018) Cross-Tissue Exploration of Genetic and Epigenetic Effects on Brain Gray Matter in Schizophrenia. Schizophr Bull 44:443-452
Cai, Biao; Zille, Pascal; Stephen, Julia M et al. (2018) Estimation of Dynamic Sparse Connectivity Patterns From Resting State fMRI. IEEE Trans Med Imaging 37:1224-1234
Liang, Xiao; Wu, CuiYan; Zhao, Hongmou et al. (2018) Assessing the genetic correlations between early growth parameters and bone mineral density: A polygenic risk score analysis. Bone 116:301-306
Liu, Li; Wen, Yan; Zhang, Lei et al. (2018) Assessing the Associations of Blood Metabolites With Osteoporosis: A Mendelian Randomization Study. J Clin Endocrinol Metab 103:1850-1855
Li, Yumei; Xiang, Yang; Xu, Chao et al. (2018) Rare variant association analysis in case-parents studies by allowing for missing parental genotypes. BMC Genet 19:7
Xu, Chao; Fang, Jian; Shen, Hui et al. (2018) EPS-LASSO: test for high-dimensional regression under extreme phenotype sampling of continuous traits. Bioinformatics 34:1996-2003

Showing the most recent 10 out of 55 publications