Heritability analysis in the largest whole genome sequencing (WGS) dataset, the NHLBI Trans-omics for Precision Medicine Whole Genome Sequencing Program (TOPMed), strongly suggested that "missing heritability" can be attributed to rare variants that are not well captured by array-based genotyping. Large genome-wide association studies (GWAS), complemented by WGS studies, will be a cost-efficient strategy to identify genetic variants and understand the genetic architecture of complex traits. Multiple large biobanks with SNP-array and WGS data, such as TOPMed, provide an unprecedented but challenging opportunity to understand the genetic mechanisms underlying complex diseases. We have identified three pressing challenges in utilizing large GWAS and WGS datasets and propose the following four specific aims to meet them: 1) Differentiate horizontal pleiotropy from mediation using GWAS summary statistics, and apply the methods to publicly available data. 2) Prioritize genetic variants sensitive to interactions, and estimate the overall contribution of interactions to a phenotype. 3) Incorporate family linkage/local ancestry to identify genetic variants in the TOPMed WGS data. 4) Develop corresponding software that will be made publicly available. We will apply our new analytic methods to TOPMed WGS data, UK Biobank data and many existing GWAS summary statistics. Our data analysis will focus on blood pressure, obesity and sleep disorders, and their effects on disease outcomes such as cardiovascular disease, diabetes, heart failure and dementia.
A large amount of genetic data on complex traits, such as blood pressure, obesity and sleep disorders, has been accumulated. However, the genetic architecture of these traits is still poorly understood, and the knowledge gained has had limited clinical application. We propose to develop novel statistical methods and software tools for analyzing multiple correlated traits using available summary statistics to address the causal relationships among such traits; for prioritizing genetic variants sensitive to environmental interaction effects; for estimating the overall contribution of interactions to a particular trait; and for detecting causal rare genetic variants from whole genome sequencing data. We will apply the new methods to large, available genome-wide summary statistics, UK Biobank data and whole genome sequencing data from the NHLBI Trans-omics for Precision Medicine (TOPMed) program. We will focus on genes that influence blood pressure, obesity and sleep disorders, and their effects on other disease outcomes, including cardiovascular disease, diabetes, heart failure and dementia.