The ENCODE (Encyclopedia of DNA Elements) and modENCODE (model organism ENCODE) projects aim to apply high-throughput, cost-efficient approaches to generate a catalog of functional elements in the human, worm, and fly genomes, which will serve as the basis for biomedical research advances. Owing to their smaller genome sizes, powerful genetics, and ease of experimentation, D. melanogaster and C. elegans can help guide the study of functional elements in the human genome, reveal new insights into global gene regulation and embryo development, and enable experimental studies of gene function and regulation that are not accessible in mammalian systems. This proposal aims to enhance the value of these datasets by creating a Data Analysis Center (DAC) to support, facilitate, and enhance the integrative analyses of the modENCODE consortium in fly and worm, to achieve a high-resolution annotation of all their functional elements, and to reveal new insights into the biology and gene regulation of animal genomes, including the human genome. We foresee four central roles for the DAC, and have organized our aims around them.
Aim 1: We will provide common computational guidelines for data processing in fly and worm, together with a shared computational infrastructure and pipeline for routine analysis and statistical tasks.
Aim 2: We will facilitate and carry out element-specific integrative analyses to identify diverse classes of functional elements based on combinations of relevant datasets from multiple groups. These include (a) enhancers, promoters, insulators, and other regions of regulatory importance, (b) protein-coding and non-coding genes, (c) regulatory networks of transcription factor and microRNA targeting, and (d) sequence features predictive of diverse classes of functional elements.
Aim 3: We will carry out exploratory data analyses across different data types to discover potentially novel correlations and insights relating diverse classes of elements. In particular, we will apply dimensionality reduction techniques to coordinate-based genome-wide genomic and epigenomic datasets, apply clustering and bi-clustering methods to identify functionally related sets of genes and modules, and analyze structural and dynamic properties of the discovered networks.
Aim 4: We will carry out comparative analyses across the two model organisms, and also with yeast and human. We will provide an ortholog resource between the species, compare regulatory relationships and dynamics for orthologous cell lines and developmental time points, and transfer biological knowledge between the model organisms and human.
To achieve these four aims, we will work closely with members of the consortium, with the modENCODE Analysis Working Group (AWG), which consists of all Principal Investigators and analysis groups, and with the Data Coordination Center (DCC), which is responsible for all data sharing within the consortium and with the larger worm and fly communities.


Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: High Impact Research and Research Infrastructure Programs (RC2)
Project #: 5RC2HG005639-02
Application #: 7943875
Study Section: Special Emphasis Panel (ZHG1-HGR-M (O1))
Program Officer: Feingold, Elise A
Project Start: 2009-09-30
Project End: 2013-08-31
Budget Start: 2010-09-01
Budget End: 2013-08-31
Support Year: 2
Fiscal Year: 2010
Total Cost: $1,316,360
Indirect Cost:

Name: Massachusetts Institute of Technology
Department:
Type: Organized Research Units
DUNS #: 001425594
City: Cambridge
State: MA
Country: United States
Zip Code: 02139

Bekelis, Kimon; Kerley-Hamilton, Joanna S; Teegarden, Amy et al. (2016) MicroRNA and gene expression changes in unruptured human cerebral aneurysms. J Neurosurg 125:1390-1399
Gan, Hin Hark; Gunsalus, Kristin C (2015) Assembly and analysis of eukaryotic Argonaute-RNA complexes in microRNA-target recognition. Nucleic Acids Res 43:9613-25
Bansal, Mukul S; Wu, Yi-Chieh; Alm, Eric J et al. (2015) Improved gene tree error correction in the presence of horizontal gene transfer. Bioinformatics 31:1211-8
Li, Jingyi Jessica; Huang, Haiyan; Bickel, Peter J et al. (2014) Comparison of D. melanogaster and C. elegans developmental stages, tissues, and cells by modENCODE RNA-seq data. Genome Res 24:1086-101
Wen, Jiayu; Mohammed, Jaaved; Bortolamiol-Becet, Diane et al. (2014) Diversity of miRNAs, siRNAs, and piRNAs across 25 Drosophila cell lines. Genome Res 24:1236-50
Gerstein, Mark B; Rozowsky, Joel; Yan, Koon-Kiu et al. (2014) Comparative analysis of the transcriptome across distant species. Nature 512:445-8
Brown, James B; Boley, Nathan; Eisman, Robert et al. (2014) Diversity and dynamics of the Drosophila transcriptome. Nature 512:393-9
Miura, Pedro; Shenker, Sol; Andreu-Agullo, Celia et al. (2013) Widespread and extensive lengthening of 3' UTRs in the mammalian brain. Genome Res 23:812-25
Gan, Hin Hark; Gunsalus, Kristin C (2013) Tertiary structure-based analysis of microRNA-target interactions. RNA 19:539-51
Mohammed, Jaaved; Flynt, Alex S; Siepel, Adam et al. (2013) The impact of age, biogenesis, and genomic clustering on Drosophila microRNA evolution. RNA 19:1295-308

Showing the most recent 10 out of 19 publications