Cis-acting regulatory elements control gene expression and are involved in all aspects of development, behavior and physiology; but no cis-regulatory element map yet exists for any metazoan genome. We therefore propose to identify cis-regulatory elements genome-wide in C. elegans. Among existing model organisms, C. elegans offers a strong combination of biological properties, transgenic technology, comparisons to the genomes of four sibling species, and critical computational and bioinformatic infrastructure. We intend to find genomic regulatory elements in genes with widely varying expression patterns, gene functions, and cis-element content that drive expression in diverse developmental stages, cell types and physiological conditions. A pipeline of genomic predictions followed by efficient transgenic reporter assays will allow us to generate and analyze 10 DNA constructs each week. In year 1 we plan to identify hundreds of regulatory elements, with higher numbers in following years as we become better at predicting functional elements. Predicted elements will be assigned statistical scores based on the quality of their supporting computational and experimental data. We will also use chromatin immunoprecipitation analyzed by intense sequencing (ChIP-seq) to find regulatory modules, and compare its accuracy in finding functional sequences to that of predictions based on comparative genomics. The first round of our results from direct tests of predicted elements and ChIP-seq will be combined with external data from modENCODE to improve our predictive algorithms for later cycles of genome-wide element prediction. Our data will be released promptly to WormBase, and all our computational tools are freely available with full source code.
Regulatory DNA sequences that control the time, place, and level of transcription are crucial for normal development, behavior and physiology as well as disease; yet there is no genome-wide map of them for any animal genome, nor are researchers currently able to predict their functional output from their DNA sequences. We will attempt to solve this problem by extensive, reiterated experimental tests of computational predictions in a simple animal genome.