Gene expression is an important molecular phenotype, providing the initial step in bridging the divide between static genomic information and dynamic organismal phenotypes. A nearly ubiquitous observation of studies performed to date is that cis-regulatory variation, located primarily in non-coding regions, is pervasive and a significant source of heritable variation in gene expression. However, the functional consequences of non-coding variation have been difficult to assess on a genome-wide scale. Recently, digital DNase I footprinting has emerged as a powerful approach for identifying in vivo DNA-protein interactions. In Aim 1, we will use digital DNase I footprinting to systematically interrogate the functional significance of non-coding variation by developing a comprehensive, nucleotide-resolution map of in vivo DNA-protein interactions in 40 genetically diverse yeast strains and species (38 strains of Saccharomyces cerevisiae, one strain of S. paradoxus, and one strain of S. bayanus). These data will yield fundamental insights into natural variation in in vivo protein binding sites and into the evolutionary forces shaping patterns of regulatory sequence variation within and between species.
In Aim 2, we will perform deep RNA-Seq on all 40 strains and species and correlate polymorphisms that alter in vivo DNA-protein interactions with gene expression levels, providing one of the largest compendiums of functional regulatory alleles generated to date. Importantly, we will also use this unique dataset to develop statistical methods for predicting functionally significant non-coding variation. The successful completion of the proposed project will provide the foundation for a more principled understanding of non-coding variation, facilitate the translation of static genomic information into predictive and quantitative models of transcript abundance, enable the interpretation of sequence variation in the context of personal genomics initiatives, and yield new insights into the evolution of gene expression levels.

Public Health Relevance

Gene expression is an important step in the process of converting DNA sequence information into phenotypes, and disrupting how much of a gene product is made, or when it is made, can lead to disease. The proposed project will develop important new experimental and statistical tools to understand mutations that influence gene expression levels, which will be critical for interpreting, understanding, and ameliorating human disease.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM098360-02
Application #: 8305471
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Krasnewich, Donna M
Project Start: 2011-08-01
Project End: 2015-04-30
Budget Start: 2012-05-01
Budget End: 2013-04-30
Support Year: 2
Fiscal Year: 2012
Total Cost: $293,509
Indirect Cost: $100,070

Name: University of Washington
Department: Genetics
Type: Schools of Medicine
DUNS #: 605799469
City: Seattle
State: WA
Country: United States
Zip Code: 98195

Connelly, Caitlin F; Wakefield, Jon; Akey, Joshua M (2014) Evolution and genetic architecture of chromatin accessibility and function in yeast. PLoS Genet 10:e1004427
Fu, Wenqing; Akey, Joshua M (2013) Selection and adaptation in the human genome. Annu Rev Genomics Hum Genet 14:467-89
Connelly, Caitlin F; Skelly, Daniel A; Dunham, Maitreya J et al. (2013) Population genomics and transcriptional consequences of regulatory motif variation in globally diverse Saccharomyces cerevisiae strains. Mol Biol Evol 30:1605-13
Skelly, Daniel A; Merrihew, Gennifer E; Riffle, Michael et al. (2013) Integrative phenomics reveals insight into the structure of phenotypic diversity in budding yeast. Genome Res 23:1496-504