The long-term goal of the proposed research is to investigate understudied genetic mechanisms that are hypothesized to influence common diseases. Genetic analyses of complex traits have largely been performed within populations of individuals of the same ancestry, mainly of European descent. Besides being ethically questionable, this is problematic for disease risk prediction, because prediction accuracy has been shown to decline as genetic divergence between training and target samples increases. One hypothesis for this observation is that different populations are likely exposed to different contexts (e.g., environmental conditions), which, in the presence of genotype-by-context interactions, results in different effect sizes across populations. In addition, context-dependent effects can substantially influence prediction accuracy even between groups of the same ancestry (e.g., different sexes). Thus, prediction models that account for gene-by-context interactions could outperform standard prediction models for disease risk in humans. While such models have provided increased accuracy in agricultural and model species, this topic has not yet been investigated in humans. This proposal will fill this gap by investigating the importance of gene-by-context interactions to the genetic architecture of blood pressure traits in multi-ancestry samples, and by incorporating these interactions into statistical models to increase the accuracy of phenotypic prediction. Blood pressure traits are medically important (e.g., they are a risk factor for cardiovascular disease, the leading cause of death worldwide) and are also excellent models of complex traits (they are moderately heritable, common variants alone explain less than half of the total heritability, and GWAS hits explain only a few percent of the total variation).
The proposed research will use large, publicly available datasets, including (but not limited to) the UK Biobank and datasets from the Trans-Omics for Precision Medicine (TOPMed) consortium.
In Specific Aim 1, the focus will be on estimating the proportion of variance explained by gene-by-context interactions, and on mapping these interactions, in multi-ancestry samples using a combination of existing linear mixed models and Bayesian methods.
In Specific Aim 2, the focus will be on increasing prediction accuracy in both single-ancestry and multi-ancestry samples by incorporating gene-by-context interactions into prediction models. While existing linear mixed models and Bayesian methods developed for agricultural data will be applied, a new prediction method better suited to human data will also be developed. Briefly, the main idea is to model gene-by-context interactions explicitly for the available contexts, while also accounting for other, unknown sources of effect heterogeneity among ancestries. This proposal will provide novel insights into the genetic architecture of blood pressure traits that will improve prediction accuracy in multi-ancestry samples, as well as a novel analysis strategy that can be applied to any trait of interest.
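As a purely illustrative sketch of the modeling idea in the two aims, one possible formulation is written out below. All notation is an assumption introduced here and does not appear in the proposal: y_i is the phenotype of individual i, x_ij the genotype at variant j, c_ik the value of measured context k, a(i) the ancestry group of individual i, beta_j a main effect, gamma_jk a gene-by-context interaction effect, and delta_{j,a(i)} an ancestry-specific deviation standing in for unmeasured sources of effect heterogeneity. The method to be developed may differ from this form.

```latex
% Illustrative notation only; these symbols are assumptions, not the proposal's.
% Requires the amsmath package.
\begin{align}
  y_i &= \mu
        + \sum_{j=1}^{p} x_{ij}\,\beta_j                          % main genetic effects
        + \sum_{k=1}^{K} c_{ik} \sum_{j=1}^{p} x_{ij}\,\gamma_{jk} % explicit gene-by-context terms
        + \sum_{j=1}^{p} x_{ij}\,\delta_{j,a(i)}                   % residual ancestry-specific heterogeneity
        + \varepsilon_i , \\
  % Proportion of phenotypic variance attributed to gene-by-context interactions,
  % assuming the four components are mutually independent:
  h^2_{g \times c} &= \frac{\sigma^2_{g \times c}}
        {\sigma^2_{g} + \sigma^2_{g \times c} + \sigma^2_{d} + \sigma^2_{\varepsilon}} .
\end{align}
```

Here sigma^2_g, sigma^2_{g x c}, sigma^2_d, and sigma^2_epsilon denote the phenotypic variance contributed by the main-effect, interaction, ancestry-deviation, and residual terms, respectively. Under these assumptions, the ratio in the second line corresponds to the quantity targeted in Specific Aim 1, and predictions for Specific Aim 2 would be formed from the fitted first three sums.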