Computational & Functional Annotation of the Zebrafish Genome Regulatory Toolbox

Zebrafish, with its growing arsenal of tools for generating transgenics, gene knockdowns and knockouts, and mutant resources, coupled with its high throughput and cost efficiency, is quickly becoming the major animal model for drug screens and gene-related studies. However, as with other vertebrate genomes, the majority of the zebrafish genome (97%) is made up of non-genic sequences whose functional necessity remains largely unknown. One vital function that is clearly embedded in these regions is gene regulation: instructing genes when and where to turn on or off. Yet unlike genes, for which we know the genomic location, the code, and the consequences of nucleotide changes, we lack that knowledge for gene regulatory sequences. This knowledge is vital: a wide variety of clinical and molecular data support these sequences as important drivers of development, evolution, diversity, and disease. In this proposal, we will combine advanced computational tools with high-throughput zebrafish functional studies to annotate this noncoding terrain. By using and refining multiple vertebrate genome alignments, we have generated an unprecedented set of 166,693 zebrafish conserved noncoding elements (CNEs), at least 8,805 of which have a direct ortholog in the human genome. Preliminary studies of a portion of these sequences, using a zebrafish transgenic enhancer assay, found that 41% function as enhancers at 24 to 48 hours post fertilization. Taking advantage of this transgenic assay, we aim to screen 200 sequences a year for enhancer activity. These sequences will be drawn from three sources: our large CNE set, sequences whose enhancer activity and tissue-timepoint specificity will be predicted using sophisticated computational tools, and community-requested sequences.
This characterization will not only allow the functional annotation of these sequences, but will also generate a novel and important toolkit of gene regulatory elements that can drive expression of any gene of interest at precise locations and precise developmental time points. In addition, we will use the annotated regulatory landscape to discover novel genes with potentially important developmental functions. This will be carried out by analyzing the expression patterns, and the functional consequences of knockdown, of less characterized genes that lie in regions rich in regulatory elements, a common sign of important developmental gene regulators. Additional computational techniques will be used to discover genes under tight regulation in novel tissue contexts, as well as pathways that are not currently studied in the contexts in which we find them enriched. All the data generated in this proposal, both computational and functional, will be made available to the community through a dedicated web browser, as well as through integration into ZFIN, Ensembl, and the UCSC genome browser. Combined, our work will advance zebrafish as the major animal model for annotating and characterizing the noncoding portion of the vertebrate genome.

Public Health Relevance

While genes make up less than 3% of our DNA, the remaining 97% contains numerous other important sequences, such as gene regulatory elements that instruct genes when and where to turn on or off. Mutations in these gene regulatory elements can have a great impact on human disease, yet their location and code remain largely unknown. In this proposal, we will take advantage of the unique properties of the zebrafish model organism to couple advanced computational tools with rapid functional zebrafish assays. This will allow us to annotate these sequences and obtain a better understanding of the vertebrate gene regulatory code, which will be of great importance to our comprehension of the genetic causes of numerous human diseases.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Special Emphasis Panel (ZRG1-BDA-F (50))
Program Officer
Feingold, Elise A
Stanford University
Anatomy/Cell Biology
Schools of Medicine
United States
Guturu, Harendra; Chinchali, Sandeep; Clarke, Shoa L et al. (2016) Erosion of Conserved Binding Sites in Personal Genomes Points to Medical Histories. PLoS Comput Biol 12:e1004711
Yang, Song; Oksenberg, Nir; Takayama, Sachiko et al. (2015) Functionally conserved enhancers with divergent sequences in distant vertebrates. BMC Genomics 16:882
Oksenberg, N; Haliburton, G D E; Eckalbar, W L et al. (2014) Genome-wide distribution of Auts2 binding localizes with active neurodevelopmental genes. Transl Psychiatry 4:e431
VanderMeer, Julia E; Lozano, Reymundo; Sun, Miao et al. (2014) A novel ZRS mutation leads to preaxial polydactyly type 2 in a heterozygous form and Werner mesomelic syndrome in a homozygous form. Hum Mutat 35:945-8
Birnbaum, Ramon Y; Patwardhan, Rupali P; Kim, Mee J et al. (2014) Systematic dissection of coding exons at single nucleotide resolution supports an additional role in cell-specific transcriptional regulation. PLoS Genet 10:e1004592
VanderMeer, Julia E; Smith, Robin P; Jones, Stacy L et al. (2014) Genome-wide identification of signaling center enhancers in the developing limb. Development 141:4194-8
Smith, Robin P; Eckalbar, Walter L; Morrissey, Kari M et al. (2014) Genome-wide discovery of drug-dependent human liver regulatory elements. PLoS Genet 10:e1004648
Zhang, Yubo; Wong, Chee-Hong; Birnbaum, Ramon Y et al. (2013) Chromatin connectivity maps reveal dynamic promoter-enhancer long-range associations. Nature 504:306-10
Hiller, Michael; Agarwal, Saatvik; Notwell, James H et al. (2013) Computational methods to detect conserved non-genic elements in phylogenetically isolated genomes: application to zebrafish. Nucleic Acids Res 41:e151
Zhao, Jingjing; Shi, Hongbo; Ahituv, Nadav (2013) Classification of topological domains based on gene expression and regulation. Genome 56:415-23

Showing the most recent 10 out of 28 publications