Several genome-wide association studies (GWAS) have been published on various complex diseases, in which genotype data on a large number of single nucleotide polymorphisms (SNPs) are collected to study the association between these SNPs and a disease. Although these GWAS have found new loci associated with different diseases, the loci generally explain very little of the genetic risk for these diseases. Much of the remaining trait variation is likely to be due to the combined effect of genes, environmental factors, and their interactions. However, most investigators conducting genome-wide association studies do not consider gene-environment (GxE) or gene-gene (GxG) interactions in their search for new genes. Moreover, most of these studies are cross-sectional. Complex diseases are frequently dynamic, varying over time with changing or accumulating environmental and physiological factors. The influence of genes on these diseases may also vary over time through interaction with factors such as age, developmental stage, or other time-dependent environmental factors. Variation in the effects of genetic variants at different stages of life could significantly alter the trajectories of traits. Hence, studies that do not consider the possibility of longitudinal variation in genetic associations may rely on overly simplistic models of variant effects and therefore lack power to detect them. This gap is due in part to a lack of efficient statistical methods and corresponding software to detect the interplay of high-volume genetic data and time-dependent environmental factors. This proposal responds to this urgent need by developing advanced statistical methods and efficient computing algorithms to analyze high-throughput data from gene-environment longitudinal studies with data on unrelated individuals as well as families. We propose to develop two efficient methods to detect GxE interactions in longitudinal studies.
They are as follows: (1) to develop techniques for robust and efficient estimation of GxE interactions in longitudinal study designs using a likelihood-based dimension reduction approach; (2) to develop a powerful random-effect model for high-dimensional data to detect joint effects of multiple SNPs and time-dependent environmental factors. The proposed methods are motivated by, and will be applied to, data from the Minnesota Center for Twin and Family Research (MCTFR), a longitudinal genome-wide study of genes, environments, and their interactions with different behavioral traits. We intend to study the etiological underpinnings of substance use disorders (SUDs), which derive from various interacting biological and psychosocial factors that work together dynamically over the course of development. Open-access, user-friendly statistical software will be developed and distributed.
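
For orientation, the kind of model that aim (2) refers to can be sketched as a generic longitudinal random-effect model with SNP-by-environment interaction terms. This is an illustrative textbook-style formulation, not the model specified in the proposal: the symbols (y_ij, G_ik, E_ij, b_i, gamma_k) and the normality assumptions are all assumptions introduced here.

```latex
% Illustrative sketch only; notation is assumed, not taken from the proposal.
% i = individual, j = visit, k = SNP within a set of K SNPs.
\begin{align*}
y_{ij} &= \beta_0 + \beta_T\, t_{ij} + \beta_E\, E_{ij}
          + \sum_{k=1}^{K} \bigl( \beta_{G,k}\, G_{ik} + \gamma_k\, G_{ik} E_{ij} \bigr)
          + b_i + \varepsilon_{ij}, \\
b_i &\sim N(0, \sigma_b^2), \qquad \varepsilon_{ij} \sim N(0, \sigma^2), \\
H_0 &: \gamma_1 = \gamma_2 = \dots = \gamma_K = 0
      \quad \text{(no GxE interaction for the SNP set).}
\end{align*}
```

Here y_ij is the trait of individual i at visit j, t_ij is the time or age at that visit, E_ij is a time-dependent environmental factor, G_ik is the genotype at SNP k, and b_i is a subject-level random intercept that accounts for repeated measures. In a family or twin design, one common option is to give the random effects a covariance structured by relatedness rather than treating them as independent across relatives.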

Public Health Relevance

We propose powerful statistical approaches for the detection of genome-wide gene-environment interactions using longitudinal data from families and unrelated individuals. Our proposed approaches will provide insights into the complex interplay between genes and environmental factors in the development of a disease. We will apply these approaches to the Minnesota Center for Twin and Family Research (MCTFR) dataset, a longitudinal prospective study of families, to characterize the nature of gene-environment interplay in the development of substance use disorders.

National Institutes of Health (NIH)
National Institute on Drug Abuse (NIDA)
Research Project (R01)
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Weinberg, Naimah Z
University of Minnesota Twin Cities
Biostatistics & Other Math Sci
Schools of Public Health
United States
Park, Jun Young; Wu, Chong; Basu, Saonli et al. (2018) Adaptive SNP-Set Association Testing in Generalized Linear Mixed Models with Application to Family Studies. Behav Genet 48:55-66
Coombes, Brandon J; Basu, Saonli; McGue, Matt (2018) A linear mixed model framework for gene-based gene-environment interaction tests in twin studies. Genet Epidemiol 42:648-663
Yang, Yi; Basu, Saonli; Mirabello, Lisa et al. (2018) A Bayesian Gene-Based Genome-Wide Association Study Analysis of Osteosarcoma Trio Data Using a Hierarchically Structured Prior. Cancer Inform 17:1176935118775103
Arbet, Jaron; McGue, Matt; Chatterjee, Snigdhansu et al. (2017) Resampling-based tests for Lasso in genome-wide association studies. BMC Genet 18:70
Balabdaoui, Fadoua; Basu, Saonli (2017) Letter to the editor comments on Groparu-Cojocaru and Doray (2013). Commun Stat Simul Comput 46:3833-3840
Ray, Debashree; Basu, Saonli (2017) A novel association test for multiple secondary phenotypes from a case-control GWAS. Genet Epidemiol 41:413-426
Coombes, Brandon; Basu, Saonli; McGue, Matt (2017) A combination test for detection of gene-environment interaction in cohort studies. Genet Epidemiol 41:396-412
Ho, Yen-Yi; Guan, Weihua; O'Connell, Michael et al. (2016) Powerful association test combining rare variant and gene expression using family data from Genetic Analysis Workshop 19. BMC Proc 10:251-255
Ray, Debashree; Pankow, James S; Basu, Saonli (2016) USAT: A Unified Score-Based Association Test for Multiple Phenotype-Genotype Analysis. Genet Epidemiol 40:20-34
Coombes, Brandon; Basu, Saonli; Guha, Sharmistha et al. (2015) Weighted Score Tests Implementing Model-Averaging Schemes in Detection of Rare Variants in Case-Control Studies. PLoS One 10:e0139355

Showing the most recent 10 out of 13 publications