We propose to develop new statistical methods for studying gene x environment (GxE) interactions using data from molecular epidemiology studies. The focus is on targeted studies that use single-cell gel electrophoresis (the comet assay) to measure DNA damage. This technology has great potential for the study of GxE, since one can assess how the distribution of DNA damage across cells from an individual varies between experimental conditions. By drawing from cell lines for individuals with known genotype, the NIEHS Comet GxE Study seeks to identify single nucleotide polymorphisms (SNPs) related to baseline DNA damage, susceptibility to genotoxic exposures, and repair rate. The phenotype for an individual in such studies is a collection of distributions corresponding to cell-specific DNA damage under different conditions. New methods are needed to analyze such distributional profiles efficiently, while allowing for heterogeneity among subjects and for SNP selection. The ability to detect GxE interactions is of great public health importance, allowing physicians to better identify patients who are more sensitive to a drug therapy or environmental exposure. Targeted molecular epidemiology studies provide an efficient alternative to traditional epidemiologic designs. Our goals include the following.
1. Develop nonparametric Bayesian statistical methods that allow a distributional profile to vary flexibly across individuals and with predictors, while allowing variable selection.
2. Apply these methods to data from the NIEHS Comet GxE Study to select SNPs associated with baseline DNA damage, susceptibility, and repair rates.
3. Develop approaches for including outside information on each SNP, including whether it is in the coding region, is synonymous, is non-synonymous but at a location at which an amino acid change is likely to be damaging, or is in an intron or flanking sequence but is likely to impact gene expression.
4. Develop approximate Bayes methods that can be implemented rapidly, while encouraging sparse modeling of distributional profiles.

Public Health Relevance

The development of complex diseases, such as cancer and diabetes, depends on the interaction between genetic predisposition and a variety of lifestyle factors, including diet and environmental exposures. Identifying gene-environment interactions is a critical step toward better understanding disease etiology and developing more effective personalized prevention and treatment strategies. We provide the statistical tools needed to efficiently detect gene-environment interactions using data from innovative molecular epidemiology designs.

National Institutes of Health (NIH)
National Institute of Environmental Health Sciences (NIEHS)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Dilworth, Caroline H
Duke University
Biostatistics & Other Math Sci
Schools of Arts and Sciences
United States
Lock, Eric F; Dunson, David B (2017) Bayesian genome- and epigenome-wide association studies with gene level dependence. Biometrics 73:1018-1028
Lock, Eric F; Soldano, Karen L; Garrett, Melanie E et al. (2015) Joint eQTL assessment of whole blood and dura mater tissue from individuals with Chiari type I malformation. BMC Genomics 16:11
Kessler, David C; Taylor, Jack A; Dunson, David B (2014) Learning phenotype densities conditional on many interacting predictors. Bioinformatics 30:1562-8
Wheeler, Matthew W; Dunson, David B; Pandalai, Sudha P et al. (2014) Mechanistic Hierarchical Gaussian Processes. J Am Stat Assoc 109:894-904
Cui, Kai; Dunson, David B (2014) Generalized Dynamic Factor Models for Mixed-Measurement Time Series. J Comput Graph Stat 23:169-191
Bhattacharya, Anirban; Pati, Debdeep; Dunson, David (2014) Anisotropic function estimation using multi-bandwidth Gaussian processes. Ann Stat 42:352-381
Craddock, R Cameron; Jbabdi, Saad; Yan, Chao-Gan et al. (2013) Imaging human connectomes at the macroscale. Nat Methods 10:524-39
Lock, Eric F; Dunson, David B (2013) Bayesian consensus clustering. Bioinformatics 29:2610-6
Page, Garritt; Bhattacharya, Abhishek; Dunson, David (2013) Classification via Bayesian Nonparametric Learning of Affine Subspaces. J Am Stat Assoc 108:187-201
Minsker, Stanislav (2013) Estimation of Extreme Values and Associated Level Sets of a Regression Function via Selective Sampling. JMLR Workshop Conf Proc :105-121

Showing the most recent 10 out of 46 publications