The goal of this proposal is to discover and interpret the code by which cis-regulatory DNA controls gene expression. Although cis-regulatory logic is reasonably well understood in bacteria and yeast, this is not the case in multicellular organisms. With a very large number of types of differentiated cells, each of which has the same genetic material but expresses a different set of genes, metazoans such as humans devote large regions of DNA to biologically essential regulatory functions. These regulatory regions are targets of selection and evolutionary change, and noncoding polymorphisms are connected to disease susceptibility. We have developed a new approach to understanding cis-regulatory logic based on understanding the principles that govern which configurations of bound transcription factors activate transcription and which configurations repress it. This approach is implemented in a model trained on quantitative expression data at cellular resolution from blastoderm-stage embryos of Drosophila melanogaster, which we use as a naturally grown gene chip. The model is able to correctly predict expression from DNA not used in the training procedure, including highly diverged sequence from distantly related species. We will interpret the cis-regulatory code by making use of a suite of tools applied to D. melanogaster and its sibling species D. erecta and D. virilis. The consideration of regulatory circuits across species at the resolution proposed represents a profound extension of network analysis. Supporting techniques include targeted chromosomal transformation of Drosophilid embryos, a sequence-based model of transcriptional control having an established predictive capability for a range of problems, whole-locus transgenes engineered at single-nucleotide resolution with recombineering, and methods for designing and testing synthetic enhancers.
The foregoing methods will allow us to test proposed principles of cis-regulatory logic as they are developed in the context of naturally occurring and artificial sequences, and in perturbed trans-environments. Our ultimate goal is to predict the expression patterns of whole genes and synthetic enhancers directly from genomic sequence and data on transcription factor expression. These objectives are summarized in the following four specific aims. 1) Design, synthesize, and experimentally test completely defined artificial enhancers that express naturally occurring or arbitrarily chosen patterns on the anterior-posterior axis. 2) Construct and experimentally test a model of the embryonic expression of the complete even-skipped locus. 3) Build a quantitative map of maternal gradients and gap gene expression in Drosophila virilis and Drosophila erecta. 4) Construct testable models of the maternal-gap-eve networks in Drosophila virilis and Drosophila erecta.

Public Health Relevance

Although the function of the portion of DNA sequence that codes for protein is understood, the function of the part that determines how DNA turns genes on and off remains to be elucidated. The goal of this project is to understand how DNA sequence controls gene expression using the fruit fly as an experimental system. The basic science developed in this project will have long-term medical applications because cancer and many birth defects result from genes being turned on and off incorrectly.

National Institutes of Health (NIH)
Office of The Director, National Institutes of Health (OD)
Research Project (R01)
Study Section
Modeling and Analysis of Biological Systems Study Section (MABS)
Program Officer
Zou, Sige
University of Chicago
Schools of Medicine
United States
Barr, Kenneth A; Martinez, Carlos; Moran, Jennifer R et al. (2017) Synthetic enhancer design by in silico compensatory evolution reveals flexibility and constraint in cis-regulation. BMC Syst Biol 11:116
Hope, C Matthew; Rebay, Ilaria; Reinitz, John (2017) DNA Occupancy of Polymerizing Transcription Factors: A Chemical Model of the ETS Family Factor Yan. Biophys J 112:180-192
Barr, Kenneth A; Reinitz, John (2017) A sequence level model of an intact locus predicts the location and function of nonadditive enhancers. PLoS One 12:e0180861
Vakulenko, Sergei; Radulescu, Ovidiu; Morozov, Ivan et al. (2017) Centralized Networks to Generate Human Body Motions. Sensors (Basel) 17:
Lou, Zhihao; Reinitz, John (2016) Parallel Simulated Annealing Using an Adaptive Resampling Interval. Parallel Comput 53:23-31
Kozlov, Vladimir; Vakulenko, Sergey; Wennergren, Uno (2016) Hamiltonian dynamics for complex food webs. Phys Rev E 93:032413
Bertolino, Eric; Reinitz, John; Manu (2016) The analysis of novel distal Cebpa enhancers and silencers using a transcriptional model reveals the complex regulatory logic of hematopoietic lineage specification. Dev Biol 413:128-44
Jiang, Pengyao; Ludwig, Michael Z; Kreitman, Martin et al. (2015) Natural variation of the expression pattern of the segmentation gene even-skipped in melanogaster. Dev Biol 405:173-81
Ramos, Alexandre F; Hornos, José Eduardo M; Reinitz, John (2015) Gene regulation and noise reduction by coupling of stochastic processes. Phys Rev E Stat Nonlin Soft Matter Phys 91:020701
Grigoriev, D; Reinitz, J; Vakulenko, S et al. (2014) Punctuated evolution and robustness in morphogenesis. Biosystems 123:106-13