Complex diseases are caused by a variety of genomic, transcriptomic, epigenomic, and proteomic factors, and many studies have suggested that these factors do not act in isolation but interact (crosstalk) at multiple levels and depend on one another. A variety of genomic technologies, such as SNP genotyping, gene expression microarrays, and the emerging next-generation sequencing (NGS), have generated vast amounts of multiscale genomic data, providing multi-dimensional and complementary information. However, these multiscale genomic data have not yet been well integrated and associated with clinical data for comprehensive analysis of a disease. The difficulty lies in the complexity and heterogeneity of multi-omics data. In addition, the specific properties of these data (e.g., their correlations across multiple levels, small sample sizes relative to the large number of biomarkers, and group structures) have not been well considered, which necessitates a paradigm shift in the technical approaches. The goal of this project is therefore to tackle these significant bioinformatics challenges by developing innovative integration approaches, such as sparse models, that account for the specific features of multiscale genomic data. Furthermore, we will apply them to the diagnosis (e.g., identification of genes) and risk prediction of complex diseases (e.g., osteoporosis). Our multi-/inter-disciplinary research team, consisting of statisticians, geneticists, molecular biologists, bioinformaticians, and biomedical engineers with complementary expertise, has worked synergistically in the past few years and contributed significantly to the development of data integration approaches.
Building on this work, we plan to accomplish the following specific aims: 1) to extract genetic signatures (e.g., CNVs) from multiple NGS samples and incorporate them into multi-omics studies; 2) to study the crosstalk and correlations among multi-omics data, from which epistatic networks can be detected; 3) to develop data integration techniques that combine multiple genomic factors for the identification of risk genes and regions; and 4) to construct a sparse regression model that predicts quantitative traits with increased power from multiple sources of genomic information, including pathways and interaction networks. We will validate our model with the study of osteoporosis at the Tulane Center for Bioinformatics and Genomics. With data collected from over 20,000 patients, we have, to our knowledge, the largest and most comprehensive datasets, which will serve as a unique platform for validating our approaches. We anticipate that the project will have a large and sustained impact. The successful implementation of the project will enable us to 1) better elucidate specific genetic risk mechanisms for osteoporosis; 2) search for potential drug targets; and 3) ultimately obtain novel approaches for better prevention and treatment of osteoporosis. Upon completion of the project, we will provide a set of efficient and powerful analytical tools for integrative data analysis and make them freely available through our ongoing software development of GCATs (Genomic Convergence Analysis Tools) for multiscale genomic data management and analysis.
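As a minimal illustration of the sparse-regression idea behind Aim 4, the sketch below fits an L1-penalized (lasso) linear model by cyclic coordinate descent on simulated data with more markers than samples. It is not the project's actual method, which the summary does not specify beyond "sparse regression". The function names, the simulated data, and the penalty value are all assumptions chosen for illustration.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t; values in [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows (each a list of p marker values); y is a list of n trait values.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b, because b starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero marker carries no information
            # Covariance of marker j with the residual after adding back its own contribution
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def simulate_data(n=50, p=100, seed=0):
    """Hypothetical data: n samples, p markers, only markers 3 and 17 affect the trait."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    true_beta = [0.0] * p
    true_beta[3], true_beta[17] = 2.0, -1.5
    y = [sum(X[i][j] * true_beta[j] for j in range(p)) + rng.gauss(0.0, 0.1)
         for i in range(n)]
    return X, y, true_beta


if __name__ == "__main__":
    X, y, _ = simulate_data()
    beta = lasso_coordinate_descent(X, y, lam=0.2)  # lam=0.2 is an arbitrary illustrative value
    selected = [j for j, b in enumerate(beta) if b != 0.0]
    print("markers with nonzero coefficients:", selected)
```

The L1 penalty sets most coefficients exactly to zero, which is what makes this family of models usable when the number of biomarkers far exceeds the sample size. In practice the penalty would be chosen by cross-validation, and group or network structure would require a different penalty than the plain lasso shown here.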
The study of genetic mechanisms underlying complex diseases is of paramount importance for diagnosis and prognosis. Our integrated and comprehensive approach promises to greatly change current practice in genomic data analysis, which, for example, neither fully utilizes correlated and complementary information nor incorporates prior knowledge from multi-omics data. Given the ubiquitous use of multi-omics techniques in biomedicine, our new paradigm for multiscale genomic data integration will therefore have a significant impact.
|Deng, Su-Ping; Hu, Wenxing; Calhoun, Vince D et al. (2018) Integrating Imaging Genomic Data in the Quest for Biomarkers of Schizophrenia Disease. IEEE/ACM Trans Comput Biol Bioinform 15:1480-1491|
|Hu, Wenxing; Lin, Dongdong; Cao, Shaolong et al. (2018) Adaptive Sparse Multiple Canonical Correlation Analysis With Application to Imaging (Epi)Genomics Study of Schizophrenia. IEEE Trans Biomed Eng 65:390-399|
|Gossmann, Alexej; Cao, Shaolong; Brzyski, Damian et al. (2018) A Sparse Regression Method for Group-Wise Feature Selection with False Discovery Rate Control. IEEE/ACM Trans Comput Biol Bioinform 15:1066-1078|
|Liu, Hui-Min; He, Jing-Yang; Zhang, Qiang et al. (2018) Improved detection of genetic loci in estimated glomerular filtration rate and type 2 diabetes using a pleiotropic cFDR method. Mol Genet Genomics 293:225-235|
|Lin, Dongdong; Chen, Jiayu; Ehrlich, Stefan et al. (2018) Cross-Tissue Exploration of Genetic and Epigenetic Effects on Brain Gray Matter in Schizophrenia. Schizophr Bull 44:443-452|
|Cai, Biao; Zille, Pascal; Stephen, Julia M et al. (2018) Estimation of Dynamic Sparse Connectivity Patterns From Resting State fMRI. IEEE Trans Med Imaging 37:1224-1234|
|Liang, Xiao; Wu, CuiYan; Zhao, Hongmou et al. (2018) Assessing the genetic correlations between early growth parameters and bone mineral density: A polygenic risk score analysis. Bone 116:301-306|
|Liu, Li; Wen, Yan; Zhang, Lei et al. (2018) Assessing the Associations of Blood Metabolites With Osteoporosis: A Mendelian Randomization Study. J Clin Endocrinol Metab 103:1850-1855|
|Li, Yumei; Xiang, Yang; Xu, Chao et al. (2018) Rare variant association analysis in case-parents studies by allowing for missing parental genotypes. BMC Genet 19:7|
|Xu, Chao; Fang, Jian; Shen, Hui et al. (2018) EPS-LASSO: test for high-dimensional regression under extreme phenotype sampling of continuous traits. Bioinformatics 34:1996-2003|
Showing the most recent 10 out of 55 publications