Although efforts to identify genetic variations associated with disease traits have led to a respectable number of newly discovered disease SNPs, many challenges remain open in this area. These GWAS SNPs have tended to explain a low proportion of the variation in the traits and to have functionally ambiguous roles. A number of publications have provided evidence that simultaneously considering genetic variation and gene expression in relevant tissue types leads to a more comprehensive characterization of the molecular basis of complex human traits. In particular, gene expression variation captures influences from both genetics and environment, both of which ultimately contribute to complex traits. A problem of much interest, therefore, is to quantitatively characterize the genetic and environmental contributions to gene expression variation, as well as the contribution of their interaction. We propose to build a quantitative framework for discovering and dissecting gene-by-environment (G x E) interactions that explain variation in high-dimensional traits, such as gene expression. We further propose to apply the methodology to several cutting-edge data sets and to make software available so that the methods may be widely used in future studies. One of the key challenges that we will tackle is to resolve the inconsistency between the statistical definition of interaction and the biological definition. We show that the statistical definition is not invariant to changes in the scale on which the trait is measured. Instead, we develop a statistical definition under which the interaction exists irrespective of changes to the scale and which also agrees with the biological definition. This is particularly important for gene expression data, where the scale on which the data are analyzed is determined by the measurement technology and is not a direct physical measure of the process by which the complex trait is manifested.
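
The scale dependence described above can be illustrated with a small simulation. The following is a minimal sketch under assumed values, not the framework proposed here: a hypothetical trait is generated so that genotype (coded 0/1/2) and a binary environment act additively on the log scale with no interaction, and an ordinary least-squares model with a product term is then fitted on both the raw and the log scale. The function name `interaction_coefficient`, the effect sizes, and the noise level are all illustrative assumptions.

```python
import numpy as np

def interaction_coefficient(y, g, e):
    """Least-squares estimate of the g*e coefficient in y ~ 1 + g + e + g*e."""
    X = np.column_stack([np.ones_like(g, dtype=float), g, e, g * e])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[3]

rng = np.random.default_rng(0)
n = 5000
g = rng.integers(0, 3, size=n).astype(float)  # genotype coded 0/1/2 (assumed coding)
e = rng.integers(0, 2, size=n).astype(float)  # binary environment (assumed)

# Hypothetical trait: purely additive on the log scale, no interaction term.
log_y = 1.0 + 0.5 * g + 0.8 * e + rng.normal(0.0, 0.1, size=n)
y = np.exp(log_y)  # same trait viewed on the raw (multiplicative) scale

raw_scale = interaction_coefficient(y, g, e)
log_scale = interaction_coefficient(log_y, g, e)

print(f"interaction estimate, raw scale: {raw_scale:.3f}")
print(f"interaction estimate, log scale: {log_scale:.3f}")
```

In this sketch the product-term estimate is clearly nonzero on the raw scale and close to zero on the log scale, even though both describe the same simulated trait. That is the sense in which the conventional statistical definition of interaction depends on the choice of scale.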

Public Health Relevance

It is well known that both genetic and environmental factors contribute to human disease. This proposal will provide a better understanding of the interaction between these two factors, especially as it is manifested through gene expression levels. The successful completion of this research will provide a better understanding of the biological basis of complex human diseases.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Exploratory/Developmental Grants (R21)
Study Section: Biomedical Computing and Health Informatics Study Section (BCHI)
Program Officer: Ramos, Erin
Princeton University
Organized Research Units
United States
Robinson, David G; Wang, Jean Y; Storey, John D (2015) A nested parallel experiment demonstrates differences in intensity-dependence between RNA-seq and microarrays. Nucleic Acids Res 43:e131
Robinson, David G; Chen, Wei; Storey, John D et al. (2014) Design and analysis of Bar-seq experiments. G3 (Bethesda) 4:11-8