Rapid progress in biomedical informatics has generated massive high-dimensional data sets ("big data"), ranging from clinical information and medical imaging to genomic sequence data. The scale and complexity of these data sets hold great promise, yet present substantial challenges. To fully exploit the information in big data, there is an urgent need for effective ways to integrate diverse data from different levels of informatics technologies. Existing data integration methods have several important limitations. In this project, we propose novel statistical methods and strategies to integrate neuroimaging, multi-omics, and clinical/behavioral data sets. To increase power for association analysis compared to existing methods, we propose a novel multi-phenotype multi-variant association method that can evaluate the cumulative effect of common and rare variants in genes or regions of interest, incorporate prior biological knowledge on the multiple phenotype structure, identify associated phenotypes among multiple phenotypes, and remain computationally efficient for high-dimensional phenotypes. To improve the prediction of clinical outcomes, we propose a novel machine learning strategy that can integrate multimodal neuroimaging and multi-omics data into a mathematical model and can incorporate prior biological knowledge to identify genomic interactions associated with clinical outcomes. The ongoing Alzheimer's Disease Neuroimaging Initiative (ADNI) and Indiana Memory and Aging Study (IMAS) projects serve as a test bed and provide a unique opportunity to evaluate and validate the proposed methods.
Specific Aims:
Aim 1: to develop powerful statistical methods for multivariate tests of associations between multiple phenotypes and a single genetic variant or set of variants (common and rare) in regions of interest, and to develop methods for mediation analysis to integrate neuroimaging, genetic, and clinical data to test for direct and indirect genetic effects mediated through neuroimaging phenotypes on clinical outcomes;
Aim 2: to develop a novel multivariate model that combines multi-omics and neuroimaging data using a machine learning strategy to predict individuals with disease or those at high risk for developing disease, and to develop a novel multivariate model incorporating prior biological knowledge to identify genomic interactions associated with clinical outcomes;
Aim 3: to evaluate and validate the proposed methods using real data from the ADNI and IMAS cohorts; and
Aim 4: to disseminate and support publicly available user-friendly software that efficiently implements the proposed methods.
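
The mediation analysis named in Aim 1 separates a direct genetic effect on a clinical outcome from an indirect effect that passes through a neuroimaging phenotype. The abstract does not specify the proposed method, so the following is only a minimal sketch of the textbook linear "product of coefficients" decomposition on simulated data. The function names, variable names, and effect sizes are hypothetical and are not taken from the project.

```python
import numpy as np


def ols(y, X):
    """Least-squares coefficients for y ~ X (X must already contain an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def mediation_effects(g, m, y):
    """Linear single-mediator decomposition (illustrative, not the proposed method).

    g: genotype (e.g. minor-allele count), m: imaging phenotype (mediator),
    y: clinical outcome. Returns direct, indirect, and total effects of g on y.
    """
    ones = np.ones(len(g))
    # Path a: effect of genotype on the mediator.
    a = ols(m, np.column_stack([ones, g]))[1]
    # Direct effect of genotype and path b (mediator -> outcome), fitted jointly.
    coef = ols(y, np.column_stack([ones, g, m]))
    direct, b = coef[1], coef[2]
    # Total effect: outcome regressed on genotype alone.
    total = ols(y, np.column_stack([ones, g]))[1]
    return {"direct": direct, "indirect": a * b, "total": total}


# Simulated data with assumed (hypothetical) effect sizes.
rng = np.random.default_rng(0)
n = 5000
g = rng.binomial(2, 0.3, size=n).astype(float)   # additive genotype coding 0/1/2
m = 0.5 * g + rng.normal(size=n)                 # true path a = 0.5
y = 0.2 * g + 0.8 * m + rng.normal(size=n)       # true direct = 0.2, path b = 0.8

effects = mediation_effects(g, m, y)
print(effects)
```

In this linear setting the total effect equals the sum of the direct and indirect effects. A real analysis would also need covariate adjustment, a significance test for the indirect effect (for example by bootstrap), and handling of multiple or high-dimensional mediators, none of which this sketch attempts.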

Public Health Relevance

Alzheimer's disease (AD), used here as an exemplar, is an increasingly common progressive neurodegenerative condition with no validated disease-modifying treatment. The proposed multivariate methods are likely to help identify novel diagnostic biomarkers and therapeutic targets for AD. Identifying new susceptibility loci and biomarkers for AD would give greater insight into the molecular mechanisms underlying the disease.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane
Indiana University-Purdue University at Indianapolis
Schools of Medicine
United States
Varma, Vijay R; Oommen, Anup M; Varma, Sudhir et al. (2018) Brain and blood metabolite signatures of pathology and progression in Alzheimer disease: A targeted metabolomics study. PLoS Med 15:e1002482
Miller, Jason E; Shivakumar, Manu K; Risacher, Shannon L et al. (2018) Codon bias among synonymous rare variants is associated with Alzheimer's disease imaging biomarker. Pac Symp Biocomput 23:365-376
Miller, Jason E; Shivakumar, Manu K; Lee, Younghee et al. (2018) Rare variants in the splicing regulatory elements of EXOC3L4 are associated with brain glucose metabolism in Alzheimer's disease. BMC Med Genomics 11:76
Yan, Qi; Nho, Kwangsik; Del-Aguila, Jorge L et al. (2018) Genome-wide association study of brain amyloid deposition as measured by Pittsburgh Compound-B (PiB)-PET imaging. Mol Psychiatry
Dutta, Diptavo; Scott, Laura; Boehnke, Michael et al. (2018) Multi-SKAT: General framework to test for rare-variant association with multiple phenotypes. Genet Epidemiol
Lee, Younghee; Han, Seonggyun; Kim, Dongwook et al. (2018) Genetic variation affecting exon skipping contributes to brain structural atrophy in Alzheimer's disease. AMIA Jt Summits Transl Sci Proc 2017:124-131
Wachinger, Christian; Nho, Kwangsik; Saykin, Andrew J et al. (2018) A Longitudinal Imaging Genetics Study of Neuroanatomical Asymmetry in Alzheimer's Disease. Biol Psychiatry 84:522-530
El-Manzalawy, Yasser; Hsieh, Tsung-Yu; Shivakumar, Manu et al. (2018) Min-redundancy and max-relevance multi-view feature selection for predicting ovarian cancer survival using multi-omics data. BMC Med Genomics 11:71
Apostolova, Liana G; Risacher, Shannon L; Duran, Tugce et al. (2018) Associations of the Top 20 Alzheimer Disease Risk Variants With Brain Amyloidosis. JAMA Neurol 75:328-341
Kim, Dokyoon; Basile, Anna O; Bang, Lisa et al. (2017) Knowledge-driven binning approach for rare variant association analysis: application to neuroimaging biomarkers in Alzheimer's disease. BMC Med Inform Decis Mak 17:61

Showing the most recent 10 out of 17 publications