Brown University is awarded a grant to develop algorithms and software tools for building genomic maps of the developmental regulatory circuitry of cells. The focus is on two fundamental problems of great practical importance. The first aim is to build the cisGRN Database of the genomic regulatory information behind gene networks, capturing the complex nature of cis-regulatory evidence drawn from many kinds of regulatory information across technologies. The database will contain the following components: genomic structure and organization, comparative genomics, cis-regulatory analysis of expression constructs and mutational analysis, spatial expression profiles, and logic functions of the genomic cis-regulatory code. The second aim is to build the Regulatory Genomics Logic Map Browser, a next-generation genome browser incorporating model-building, annotation, and visualization capabilities for gene regulatory systems and networks. It will present genomic structural views, spatial expression views, expression construct views, and information processing views, obtained through mathematical analysis of the logical principles of genomic regulatory systems and networks. The work will be carried out in collaboration with the California Institute of Technology.
The main focus of the scientific method is causality. For cis-regulatory molecular biology, the application area of this grant, the quintessential iterative trio of the scientific method (description, prediction, testing) aims to discover "omic" integrated, causality-based genetic mechanisms. The resulting computational models are descriptive, in that they represent the genomic and cis-regulatory state, and predictive, in that conjectures formulated in those descriptive terms are precise predictions of regulatory-input/transcriptional-output behavior. The essential property is that these predictions are testable: they are falsifiable within present experimental capabilities, in many molecular biology labs. Einstein highlighted the two fundamental methodological breakthroughs needed: "Development of Western science is based on two great achievements: the invention of the formal logical system (in Euclidean geometry) by the Greek philosophers, and the discovery of the possibility to find out causal relationships by systematic experiment (during Renaissance)." The work funded by this proposal was done in collaboration with Eric Davidson of the California Institute of Technology, the leading experimental biologist in the field. This grant helped us deliver the following: the CYRENE cisGRN-Lexicon Database, a database of causality-inferred genomic cis-regulatory evidence for gene regulatory networks (GRNs). It contains the regulatory architecture of 423 transcription-factor-encoding genes and 194 other regulatory genes in eight species: human, mouse, fruit fly, sea urchin, nematode, rat, chicken, and zebrafish, with a higher priority on the first five. The only target genes included in the cisGRN-Lexicon, the CYRENE genes, are those whose regulatory architecture was validated by the Davidson Criterion, the gold standard of experimental validation procedures.
Content for the cisGRN-Lexicon database was created by a small army of Brown student annotators over the last four years. The other deliverables are: the CYRENE cisGRN Logic Map Browser, a genome browser software system devoted to genomic cis-regulatory systems, with a gene network editor bridge integrated into the Davidson Lab’s BioTapestry system (the browser builds on and expands the open-source Celera Genomics Genome Browser code); CLOSE (cis-Lexicon Ontology Search Engine), a software system incorporating a set of algorithmic strategies for automated literature extraction of genomic cis-regulation articles; and a supporting suite of online tools: cis-Browser Lite, a version of the CYRENE database; seqFinder, which helps researchers find and visualize cis-regulatory modules within larger DNA sequences; and Cedar, a Java-based cisGRN-Lexicon error detection and retrieval toolkit.
In addition to the above, other contributions reached the prototype stage: the BEAR baby-Browsers data structure, a reengineering of the CYRENE cisGRN-Browser software with capabilities that transfer quickly to other ome-specific browsers (e.g., the ARIADNE Haplotype-Browser); and the Virtual Sea Urchin prototype for cell-specific dynamic time-series visualization of gene expression.
The cisGRN Browser provides a new teaching and research environment that supports both dry and wet labs. Soon, students in the PI’s "Algorithmic Foundations of Computational Biology" course will work on their personalized cisGRN Browsers, studying the regulatory architectures of the universe of transcription-factor-encoding genes. The cisGRN Browser and the cisGRN Lexicon are just two components of the CELLARIUM, a broader project under development by the PI to create the classroom of the future for teaching computational biology.
This NSF-funded research advanced the careers of 25 Brown undergraduate and graduate students (including 8 women) who served as annotators for the project. Of note: Ryan Tarpine (Ph.D. thesis) is now at Google Research; Kyle Schutter (honors thesis) is now founder and managing director of Takamoto Biogas in Kenya; Tim Johnstone (honors thesis) is now a graduate student in systems biology at Yale University; James Hart is a graduate student in developmental biology at U.C. Berkeley; Jake Franco is now a medical student at Stony Brook School of Medicine; David Moskovitz is now a graduate student in computational biology at Stanford; and Will Allen (Churchill Fellowship) is now a graduate student in systems biology at the University of Cambridge.
The cisGRN Browser was officially released to the scientific community at the Developmental Biology of the Sea Urchin Conference, held in April 2011 at the Marine Biological Laboratory in Woods Hole, Mass. Jongmin Nam, Ping Dong, Ryan Tarpine, Sorin Istrail, and Eric H. Davidson published a paper about the browser, "Functional cis-regulatory genomics for systems biology," in Proceedings of the National Academy of Sciences (vol. 107, no. 8, pp. 3930-3935, 2010). Brown University’s Today at Brown covered the release in an article titled "A great leap forward in gene research" (http://today.brown.edu/articles/2010/03/tarpine), and the NSF website linked to the article. Work funded by this proposal appeared in five published articles and a book chapter, and has been cited in about 50 publications to date, in venues including Nature, Computational Systems Biology, Developmental Biology, and Communications of the ACM.