Our primary goal is to comprehensively identify the functional elements of the Drosophila transcriptome, including the poly(A)- transcripts, all translated regions, and all RNA elements bound by proteins. In addition, we propose to elucidate the RNA sequences and binding proteins that regulate splicing, translation and stability of RNA. We plan to survey representative time points throughout development, a wide variety of tissue types and well-characterized cell lines. We will produce a catalog of poly(A)- transcripts with single-base resolution and validate a subset of transcription start sites and termination sites using RLM-RACE. We will discover novel translated elements of mRNAs and long non-coding RNAs using ribosome profiling and validate a subset using a protein expression assay. We will comprehensively identify the RNAs bound by the complete set of RNA-binding proteins (RBPs) in Drosophila using affinity purification followed by high-throughput sequencing (RIP-seq), prioritizing RBPs with human homologs. RBPs recognize and bind to specific RNA sequences, and these interactions can regulate the stability, splicing, nuclear export, subcellular localization, and translatability of mRNAs. Concurrent with these studies, bioinformatic analyses will identify new poly(A)- transcripts, model their secondary structures, discover novel translated elements of mRNAs such as uORFs and identify long non-coding RNAs. For each RBP, we will identify bound transcripts using statistical measures of reproducibility across biological replicates and of relative enrichment. We also plan to identify binding-site motifs, represented as position weight matrices (PWMs), and to develop the network of RNA-protein interactions. We will integrate the RNA-protein interaction map with the published Drosophila Protein Interaction Map. The scope of these studies is unprecedented, and they will provide the most comprehensive set of experimental evidence for post-transcriptional gene regulation in any organism. As a public resource, these studies are a prerequisite for understanding normal metabolism, growth and differentiation, and they will aid in understanding these processes in other organisms and in human health and disease.

Public Health Relevance

Drosophila models have been developed for many human diseases, including neurodegenerative diseases and cancers. The advanced genetic tools needed to gain mechanistic insights are well established, and the simpler genome, which encodes highly conserved genes and gene networks, allows more rapid progress than is possible in mammalian systems. A complete catalog of functional RNA elements will help to reveal mechanisms of disease and will therefore be a major benefit to society.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Specialized Center--Cooperative Agreements (U54)
Study Section
Special Emphasis Panel (ZHG1-HGR-M (M1))
Program Officer
Feingold, Elise A
Lawrence Berkeley National Laboratory
Organized Research Units
United States
Duff, Michael O; Olson, Sara; Wei, Xintao et al. (2015) Genome-wide identification of zero nucleotide recursive splicing in Drosophila. Nature 521:376-9
Brown, James B; Celniker, Susan E (2015) Lessons from modENCODE. Annu Rev Genomics Hum Genet 16:31-53
Westholm, Jakub O; Miura, Pedro; Olson, Sara et al. (2014) Genome-wide analysis of Drosophila circular RNAs reveals their structural and sequence properties and age-dependent neural accumulation. Cell Rep 9:1966-80
Gerstein, Mark B; Rozowsky, Joel; Yan, Koon-Kiu et al. (2014) Comparative analysis of the transcriptome across distant species. Nature 512:445-8
Plocik, Alex M; Graveley, Brenton R (2013) New insights from existing sequence data: generating breakthroughs without a pipette. Mol Cell 49:605-17
Braunschweig, Ulrich; Gueroussov, Serge; Plocik, Alex M et al. (2013) Dynamic integration of splicing within gene regulatory pathways. Cell 152:1252-69