The ability to control the activity level of different genes is key to fundamental biological processes such as development and differentiation, and many human diseases are caused by defects in this regulatory process. This regulation is encoded within specific regions of the genome, termed regulatory regions. Indeed, in many studies of cancer and of other diseases and human phenotypes, changes in gene activity that are tightly linked to the disease state have in turn been linked to changes in the DNA sequence of the genes' regulatory regions. However, we currently have a poor understanding of how gene activity is encoded by DNA sequence, and thus we do not understand the mechanism by which these disease-linked sequence changes cause the observed changes in gene activity. Given the many studies of gene regulation that have been carried out, it is surprising how little we know about this mapping between gene activity and DNA sequence. In principle, such questions can be answered directly through accurate measurements of regulatory regions in which various sequence elements are varied systematically. However, such data do not currently exist, most likely due to the technical difficulties in constructing such sequences and accurately measuring their activity. Here, we aim to derive a mechanistic understanding of how gene activity patterns are encoded in DNA sequence, and to arrive at a quantitative model that describes the entire process: from the activity of the regulating proteins, termed transcription factors, to their binding to regulatory regions, through the important role of DNA packaging in this process, and up to the resulting gene activity patterns. A systematic study of such interactions requires the ability to efficiently synthesize many different regulatory sequences and accurately measure their activity.
We have recently developed such capabilities, which we will utilize in this project. Specifically, we will design regulatory sequences that systematically test the quantitative contribution of various types of sequence elements to gene activity, measure their activity, and integrate the resulting data into a unified model of gene regulation. We will then use this model to examine how such regulatory sequence elements are used in native promoters to achieve biologically meaningful activity patterns, and how changes in these sequence elements during evolution contribute to evolutionary changes in gene activity. Finally, we will apply the model to predict gene activity changes among human individuals, using the emerging genotype data that is rapidly being collected. If successful, our project should have far-reaching implications. Most notably, since changes in gene activity levels play a key role in the development of cancer and of many other diseases, even a partial ability to predict gene activity changes among human individuals from their genotype information could have important medical implications.
The ability to control the activity level of different genes is key to fundamental biological processes such as development and differentiation, and many human diseases are caused by defects in this regulatory process. This proposal aims to unravel the rules by which this control is encoded in the language of DNA sequence, and to arrive at a quantitative model that can be used to predict changes in gene activity levels across human individuals, based on the emerging genotype data that is rapidly being collected for them. Since changes in gene activity levels play a key role in the development of cancer and of many other diseases, such a predictive ability could have important medical implications.