Many genome-wide association studies (GWASs) have been published on complex diseases, in which data on a large number of single nucleotide polymorphisms (SNPs) are collected to study the association between these SNPs and a disease. Although new associated loci continue to be found, together they generally explain very little of the additive genetic variation (heritability) estimated from biometric modeling of these traits in families. A recent series of papers [1,2] demonstrated that much of the heritability is not 'missing' but 'hidden': it cannot be detected with existing sample sizes. These papers estimated the total variance explained by genome-wide SNPs (SNP-based heritability) using the correlation among individuals in population-based samples of unrelated individuals. These methods are now widely used to determine the SNP-based heritability of many phenotypes [3,4,5]. However, several recent studies have questioned the accuracy of this heritability estimation [6,7,8]. Moreover, the influence of genes on these diseases may vary over time through interaction with factors such as age, culture, or other location- and time-dependent environmental factors. This proposal responds to the need for robust estimation of SNP-based heritability by developing advanced statistical methods and efficient computing algorithms for analyzing GWAS data. We propose to develop computationally efficient methods to investigate heritability in a cohort study through a dimension reduction approach motivated by the Gaussian predictive process model [9]. We aim to use the spatial dependency in the genetic relationships among the sampled individuals to guide this dimension reduction. The proposed methods are motivated by, and will be applied to, data from the Minnesota Center for Twin and Family Research (MCTFR), a longitudinal genome-wide study of genes, environments, and their interactions with different behavioral traits.
We intend to study SNP-based heritability for different substance use disorders (SUDs). Phenotypes similar to those considered in the MCTFR study will be used to derive SNP-based heritability estimates in the UK Biobank cohort data. Open-access, user-friendly statistical software will be developed and distributed.
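
As a rough illustration of the idea that SNP-based heritability can be estimated from the correlation among unrelated individuals, the sketch below builds a genetic relationship matrix (GRM) from standardized genotypes and regresses phenotype cross-products on its off-diagonal entries (Haseman-Elston regression). This is a deliberately simple stand-in chosen for brevity: it is not the estimator used in the cited papers, and it is not the predictive-process dimension reduction proposed here. The function names, the simulated genotypes, and all parameter values (`n`, `m`, `true_h2`) are assumptions made only for this example.

```python
import numpy as np

def standardize_genotypes(genotypes):
    """Center and scale an (individuals x SNPs) matrix of 0/1/2 allele counts."""
    G = np.asarray(genotypes, dtype=float)
    freq = G.mean(axis=0) / 2.0            # sample allele frequency per SNP
    keep = (freq > 0) & (freq < 1)         # drop monomorphic SNPs
    G, freq = G[:, keep], freq[keep]
    return (G - 2 * freq) / np.sqrt(2 * freq * (1 - freq))

def genetic_relationship_matrix(Z):
    """GRM: average over SNPs of products of standardized genotypes."""
    return Z @ Z.T / Z.shape[1]

def haseman_elston_h2(grm, phenotype):
    """Slope (through the origin) of y_i * y_j on A_ij over pairs i < j."""
    y = np.asarray(phenotype, dtype=float)
    y = (y - y.mean()) / y.std()
    upper = np.triu_indices_from(grm, k=1)  # off-diagonal pairs only
    a = grm[upper]
    yy = np.outer(y, y)[upper]
    return float(a @ yy / (a @ a))

# Simulated data (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
n, m, true_h2 = 500, 500, 0.5
allele_freq = rng.uniform(0.1, 0.9, size=m)
genotypes = rng.binomial(2, allele_freq, size=(n, m))

Z = standardize_genotypes(genotypes)
grm = genetic_relationship_matrix(Z)

# Phenotype = additive genetic value + independent noise.
beta = rng.normal(0.0, np.sqrt(true_h2 / Z.shape[1]), size=Z.shape[1])
y = Z @ beta + rng.normal(0.0, np.sqrt(1 - true_h2), size=n)

h2_hat = haseman_elston_h2(grm, y)
print(f"simulated h2 = {true_h2}, estimated h2 = {h2_hat:.2f}")
```

With samples this small the estimate is noisy, which is consistent with the point made above that the accuracy of SNP-based heritability estimates depends heavily on sample size and modeling assumptions.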

Public Health Relevance

We propose computationally efficient statistical approaches for robust estimation of heritability from genome-wide genetic and environmental data on individuals in large cohort studies. Our proposed approaches will provide insight into the relative contribution of genetic and environmental factors to phenotypic variation, and will provide an upper bound on the genetic contribution that informs the utility of genetic risk prediction models. We will implement these approaches on UK Biobank data and the Minnesota Center for Twin and Family Research (MCTFR) dataset to characterize gene-environment interplay and to efficiently estimate the heritability of substance use disorders.

Agency
National Institutes of Health (NIH)
Institute
National Institute on Drug Abuse (NIDA)
Type
Exploratory/Developmental Grants (R21)
Project #
5R21DA046188-02
Application #
9746657
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Lossie, Amy C
Project Start
2018-08-01
Project End
2020-07-31
Budget Start
2019-08-01
Budget End
2020-07-31
Support Year
2
Fiscal Year
2019
Total Cost
Indirect Cost
Name
University of Minnesota Twin Cities
Department
Biostatistics & Other Math Sci
Type
Schools of Public Health
DUNS #
555917996
City
Minneapolis
State
MN
Country
United States
Zip Code
55455