We propose to develop new statistical methods for studying gene x environment (GxE) interactions using data from molecular epidemiology studies. The focus is on targeted studies that use single-cell gel electrophoresis (the comet assay) to measure DNA damage. This technology has great potential for the study of GxE, since one can assess how the distribution of DNA damage across cells from an individual varies between experimental conditions. By drawing from cell lines for individuals with known genotype, the NIEHS Comet GxE Study seeks to identify single nucleotide polymorphisms (SNPs) related to baseline DNA damage, susceptibility to genotoxic exposures, and repair rate. The phenotype for an individual in such studies is a collection of distributions corresponding to cell-specific DNA damage under different conditions. New methods are needed to analyze such distributional profiles efficiently while allowing for heterogeneity among subjects and for SNP selection. The ability to detect GxE interactions is of great public health importance, allowing physicians to better identify patients who are more sensitive to a drug therapy or environmental exposure. Targeted molecular epidemiology studies provide an efficient alternative to traditional epidemiologic designs. Our goals include the following.

1. Develop nonparametric Bayesian statistical methods that allow a distributional profile to vary flexibly across individuals and with predictors, while allowing variable selection.
2. Apply these methods to data from the NIEHS Comet GxE Study to select SNPs associated with baseline DNA damage, susceptibility, and repair rates.
3. Develop approaches for including outside information on each SNP, including whether it is in the coding region, is synonymous, is non-synonymous but at a location at which an amino acid change is likely to be damaging, or is in an intron or flanking sequence but is likely to impact gene expression.
4. Develop approximate Bayes methods that can be implemented rapidly, while encouraging sparse modeling of distributional profiles.

Public Health Relevance

The development of complex diseases, such as cancer and diabetes, depends on the interaction between genetic predisposition and a variety of lifestyle factors, including diet and environmental exposures. Identifying gene-environment interactions is a critical step toward a better understanding of disease etiology and toward more effective personalized prevention and treatment strategies. We provide the statistical tools necessary to efficiently detect gene-environment interactions using data from innovative new molecular epidemiology designs.

Agency
National Institutes of Health (NIH)
Institute
National Institute of Environmental Health Sciences (NIEHS)
Type
Research Project (R01)
Project #
5R01ES017436-05
Application #
8496781
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Dilworth, Caroline H
Project Start
2009-09-25
Project End
2014-06-30
Budget Start
2013-07-01
Budget End
2014-06-30
Support Year
5
Fiscal Year
2013
Total Cost
$337,136
Indirect Cost
$121,023
Name
Duke University
Department
Biostatistics & Other Math Sci
Type
Schools of Arts and Sciences
DUNS #
044387793
City
Durham
State
NC
Country
United States
Zip Code
27705
Wheeler, Matthew W; Dunson, David B; Pandalai, Sudha P et al. (2014) Mechanistic Hierarchical Gaussian Processes. J Am Stat Assoc 109:894-904
Bhattacharya, Anirban; Pati, Debdeep; Dunson, David (2014) Anisotropic function estimation using multi-bandwidth Gaussian processes. Ann Stat 42:352-381
Kessler, David C; Taylor, Jack A; Dunson, David B (2014) Learning phenotype densities conditional on many interacting predictors. Bioinformatics 30:1562-8
Cui, Kai; Dunson, David B (2014) Generalized Dynamic Factor Models for Mixed-Measurement Time Series. J Comput Graph Stat 23:169-191
Lock, Eric F; Dunson, David B (2013) Bayesian consensus clustering. Bioinformatics 29:2610-6
Lock, Eric F; Hoadley, Katherine A; Marron, J S et al. (2013) Joint and Individual Variation Explained (JIVE) for integrated analysis of multiple data types. Ann Appl Stat 7:523-542
Craddock, R Cameron; Jbabdi, Saad; Yan, Chao-Gan et al. (2013) Imaging human connectomes at the macroscale. Nat Methods 10:524-39
Zhang, Jenny; Grubor, Vladimir; Love, Cassandra L et al. (2013) Genetic heterogeneity of diffuse large B-cell lymphoma. Proc Natl Acad Sci U S A 110:1398-403
Minsker, Stanislav (2013) Estimation of Extreme Values and Associated Level Sets of a Regression Function via Selective Sampling. JMLR Workshop Conf Proc :105-121
Murray, Jared S; Dunson, David B; Carin, Lawrence et al. (2013) Bayesian Gaussian Copula Factor Models for Mixed Data. J Am Stat Assoc 108:656-665

Showing the most recent 10 out of 17 publications