The long-term goal of this project is to understand the combinatorial rules that govern the interactions between transcription factor binding sites (TFBS). Through these rules, combinations of TFBS specify an enormous diversity of complex gene expression patterns. Normal growth and development depend on the tight control that TFBS exert over levels of gene expression in both time and space, and aberrant regulation of gene expression underlies many genetic diseases. Although much progress has been made in identifying TFBS and the transcription factors (TFs) that bind to them, much less is known about how TFBS interact with each other to generate specific patterns of gene expression. This lack of knowledge is manifested in the inability to predict the expression patterns specified by novel combinations of TFBS, and the inability to distinguish true regulatory regions in the genome from spurious clusters of TFBS. This proposal addresses these problems through experiments designed to unravel the mechanisms that govern TFBS interactions. Large libraries of simplified synthetic promoters will be constructed and assayed for expression and TF occupancy. The data from these libraries will be analyzed with a thermodynamic model that describes the physical interactions between TFBS, TFs and RNA polymerase. The models produced from synthetic promoters will be explicitly tested on genomic promoters.
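As a rough illustration of what a thermodynamic model of this kind computes, the sketch below enumerates every bound/unbound state of a small promoter and returns the equilibrium probability that RNA polymerase is bound. It is a minimal sketch under stated assumptions, not the model used in this project: the function name `polymerase_occupancy`, its parameters, and the restriction of TF-TF cooperativity to adjacent sites are all hypothetical choices made for the example.

```python
from itertools import product


def polymerase_occupancy(q_sites, q_pol, w_tf_pol, w_tf_tf=1.0):
    """Equilibrium probability that RNA polymerase is bound.

    All names and parameters here are illustrative assumptions:
    q_sites  : list of dimensionless weights ([TF] * K) for each TFBS
    q_pol    : dimensionless weight ([polymerase] * K) for the core promoter
    w_tf_pol : cooperativity between each bound TF and bound polymerase
    w_tf_tf  : cooperativity between TFs bound at adjacent sites
    """
    n_sites = len(q_sites)
    z_total = 0.0  # sum of weights over all states (partition function)
    z_pol = 0.0    # sum of weights over states with polymerase bound

    # Enumerate every combination of bound (1) / unbound (0) positions;
    # the last position is the polymerase.
    for state in product((0, 1), repeat=n_sites + 1):
        *tf_bound, pol_bound = state
        weight = 1.0

        # Each bound TF contributes its own binding weight.
        for bound, q in zip(tf_bound, q_sites):
            if bound:
                weight *= q

        # Each pair of adjacent bound TFs contributes a cooperativity term.
        for left, right in zip(tf_bound, tf_bound[1:]):
            if left and right:
                weight *= w_tf_tf

        # Bound polymerase contributes its weight, plus one interaction
        # term for every bound TF.
        if pol_bound:
            weight *= q_pol * w_tf_pol ** sum(tf_bound)
            z_pol += weight

        z_total += weight

    return z_pol / z_total


if __name__ == "__main__":
    # Compare a promoter with no TFBS to promoters with one and two sites.
    for sites in ([], [1.0], [1.0, 1.0]):
        print(len(sites), polymerase_occupancy(sites, q_pol=0.1, w_tf_pol=10.0))
```

In this toy formulation, expression is taken to be proportional to polymerase occupancy, so adding activating sites (`w_tf_pol` greater than 1) raises predicted expression, while setting every interaction term to 1 makes TF binding irrelevant to the output.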
In Aim 1 this system will be used to study the extent to which the simple occupancy of TFBS by TFs determines complex patterns of gene expression.
In Aim 2 the system will be extended to determine the effects of additional variables on combinatorial cis-regulation, including the strength of the TATA box, chromosomal location, and chromatin modifications. The successful completion of the aims of this proposal will result in a quantitative and molecular understanding of the rules underlying combinatorial cis-regulation. Such an understanding is necessary to empower biomedical applications, such as stem cell engineering, that are based on manipulating gene expression patterns. The results produced from this proposal will also help guide the annotation of the large regions of non-coding DNA in the genome that specify gene expression patterns. Finally, a clear understanding of TFBS interactions will aid the identification and interpretation of disease-causing genetic variants that affect cis-regulation.
In addition to serving as a parts list of genes, the genome also encodes information that controls precisely where, when, and at what levels genes are expressed. Strict control of gene expression is critical for normal growth and development, and aberrant gene expression underlies many genetic diseases, including cancer. Successful completion of the experiments in this proposal will illuminate the processes through which information in the genome controls precise patterns of gene expression, and will help us interpret disease-causing genetic variants that alter normal patterns of gene expression.
Sherman, Marc S; Cohen, Barak A (2014) A computational framework for analyzing stochasticity in gene expression. PLoS Comput Biol 10:e1003596
Kwasnieski, Jamie C; Fiore, Christopher; Chaudhari, Hemangi G et al. (2014) High-throughput functional testing of ENCODE segmentation predictions. Genome Res 24:1595-602
Mogno, Ilaria; Kwasnieski, Jamie C; Cohen, Barak A (2013) Massively parallel synthetic promoter assays reveal the in vivo effects of binding site variants. Genome Res 23:1908-15
White, Michael A; Myers, Connie A; Corbo, Joseph C et al. (2013) Massively parallel in vivo enhancer assay reveals that highly local features determine the cis-regulatory function of ChIP-seq peaks. Proc Natl Acad Sci U S A 110:11952-7
Sherman, Marc S; Cohen, Barak A (2012) Thermodynamic state ensemble models of cis-regulation. PLoS Comput Biol 8:e1002407