One of the greatest challenges in animal biology is to learn how genomic sequence information is read by transcription factors to produce patterns of gene expression within the context of regulatory networks in developing embryos. This Project is part of a broader Program Project that will integrate computational modeling and wet laboratory methods to address this challenge, in the belief that only quantitative, predictive mathematical models that have been validated experimentally can provide the rigorous understanding required for modeling the transcriptional networks of animals. This Project's contribution to the overall Program will be to develop predictive computational models to understand the molecular mechanisms responsible for targeting transcription factors to DNA in vivo. In Preliminary Studies, we have shown that we can predict in vivo DNA binding with reasonable success using as input only in vitro DNA binding specificity, in vivo chromatin accessibility information, and a simple generalized Hidden Markov Model (gHMM). These models are not yet sufficiently accurate, however. We will therefore develop more sophisticated Markov Random Field models that simultaneously consider all 32 principal regulatory transcription factors in the Drosophila blastoderm network, a number of ubiquitously expressed transcription factors, nucleosomes, and the interactions of all these proteins with DNA and with each other. We have divided our modeling into two Aims. The first will extend our current models of chromatin accessibility to determine what drives the genomic distribution of nucleosomes in the embryo, and whether and how nucleosome structure is spatially regulated across the embryo. The second seeks to identify and evaluate the molecular forces that target transcription factors to DNA in vivo. Our models will be tested, refined, and validated in a series of transgenic experiments conducted in collaboration with Projects 1 and 2 and the Expression and Database Core.
We expect that our models will uncover general principles governing the targeting of transcription factors to DNA in vivo that will be applicable to all animal systems. By helping to establish how to read transcriptional information in animal genomes, this Project will aid both the development of therapeutics for human genetic diseases and the understanding of animal development.

National Institutes of Health (NIH)
Research Program Projects (P01)
Study Section
Special Emphasis Panel (ZRG1)
Lawrence Berkeley National Laboratory
United States
Li, Jingyi Jessica; Bickel, Peter J; Biggin, Mark D (2014) System wide analyses have underestimated protein abundances and the importance of transcription in mammals. PeerJ 2:e270
Knowles, David W; Biggin, Mark D (2013) Building quantitative, three-dimensional atlases of gene expression and morphology at cellular resolution. Wiley Interdiscip Rev Dev Biol 2:767-79