The ability to control the activity level of different genes is key to fundamental biological processes such as development and differentiation, and many human diseases are caused by defects in this regulatory process. This regulation is encoded within specific regions of the genome, termed regulatory regions. Indeed, in many studies of cancer and of other diseases and human phenotypes, changes in gene activity that are tightly linked to the disease state have in turn been linked to changes in the DNA sequence of the genes' regulatory regions. However, we currently have a poor understanding of how gene activity is encoded by DNA sequence, and thus we do not understand by what mechanism these disease-linked sequence changes cause the observed changes in gene activity. Given the many studies of gene regulation that have been carried out, it is surprising how little we know about this mapping between DNA sequence and gene activity. In principle, such questions can be answered directly through accurate measurements of the activity of regulatory regions in which various sequence elements are varied systematically. However, such data do not currently exist, most likely due to the technical difficulties in constructing such sequences and accurately measuring their activity. Here, we aim to derive a mechanistic understanding of how gene activity patterns are encoded in DNA sequence, and to arrive at a quantitative model that describes the entire process: from the activity of the regulating proteins, termed transcription factors, to their binding to regulatory regions, through the important role of DNA packaging in this process, and up to the gene activity patterns that result from the DNA binding activity of these transcription factors. A systematic study of such interactions requires the ability to efficiently synthesize many different regulatory sequences and accurately measure their activity.
We have recently developed such capabilities, which we will utilize in this project. Specifically, we will design regulatory sequences that systematically test the quantitative contribution of various types of sequence elements to gene activity, measure their activity, and integrate the resulting data into a unified model of gene regulation. We will then use this model to examine how such regulatory sequence elements are used in native promoters to achieve biologically meaningful activity patterns, and how changes in these sequence elements during evolution contribute to evolutionary changes in gene activity. Finally, we will apply the model to predict gene activity changes among human individuals, using the emerging genotype data that is rapidly being collected. If successful, our project should have far-reaching implications. Most notably, since changes in gene activity levels play a key role in the development of cancer and of many other diseases, even a partial ability to predict gene activity changes among human individuals from their genotype information could have important medical implications.

Public Health Relevance

The ability to control the activity level of different genes is key to fundamental biological processes such as development and differentiation, and many human diseases are caused by defects in this regulatory process. This proposal aims to unravel the rules by which this control is encoded in the language of DNA sequence, and to arrive at a quantitative model that can be used to predict changes in gene activity levels across human individuals, based on the emerging genotype data that is rapidly being collected for them. Since changes in gene activity levels play a key role in the development of cancer and of many other diseases, such a predictive ability could have important medical implications.

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project (R01)
Project #: 4R01CA119176-08
Application #: 8518233
Study Section: Special Emphasis Panel (ZRG1-GGG-F (02))
Program Officer: Li, Jerry
Project Start: 2006-04-21
Project End: 2016-07-31
Budget Start: 2013-08-01
Budget End: 2014-07-31
Support Year: 8
Fiscal Year: 2013
Total Cost: $445,679
Indirect Cost: $12,832
Name: Weizmann Institute of Science
Department:
Type:
DUNS #: 600048466
City: Rehovot, Israel
State:
Country: Israel
Zip Code: 76100

Koch, Christopher; Konieczka, Jay; Delorey, Toni et al. (2017) Inference and Evolutionary Analysis of Genome-Scale Regulatory Networks in Large Phylogenies. Cell Syst 4:543-558.e8
Knaack, Sara A; Thompson, Dawn A; Roy, Sushmita (2016) Reconstruction and Analysis of the Evolution of Modular Transcriptional Regulatory Programs Using Arboretum. Methods Mol Biol 1361:375-89
Bao, Xiaoyan Robert; Ong, Shao-En; Goldberger, Olga et al. (2016) Mitochondrial dysfunction remodels one-carbon metabolism in human cells. Elife 5:
Thompson, Dawn A (2016) Comparative Transcriptomics in Yeasts. Methods Mol Biol 1361:67-76
Manor, Ohad; Segal, Eran (2015) GenoExp: a web tool for predicting gene expression levels from single nucleotide polymorphisms. Bioinformatics 31:1848-50
Thompson, Dawn; Regev, Aviv; Roy, Sushmita (2015) Comparative analysis of gene regulatory networks: from network reconstruction to evolution. Annu Rev Cell Dev Biol 31:399-428
Ford, Christopher B; Funt, Jason M; Abbey, Darren et al. (2015) The evolution of drug resistance in clinical isolates of Candida albicans. Elife 4:e00662
Abbey, Darren A; Funt, Jason; Lurie-Weinberger, Mor N et al. (2014) YMAP: a pipeline for visualization of copy number variation and loss of heterozygosity in eukaryotic pathogens. Genome Med 6:100
Schwartz, Schraga; Bernstein, Douglas A; Mumbach, Maxwell R et al. (2014) Transcriptome-wide mapping reveals widespread dynamic-regulated pseudouridylation of ncRNA and mRNA. Cell 159:148-162
Weingarten-Gabbay, Shira; Segal, Eran (2014) The grammar of transcriptional regulation. Hum Genet 133:701-11

Showing the most recent 10 out of 40 publications