dictyBase is the model organism database (MOD) for the eukaryote Dictyostelium discoideum and related species. A resource widely supported by its research community, dictyBase contains gold-standard expert literature curation of genes, functional annotations using the Gene Ontology, and a wide range of genomic resources. Dictyostelium is widely used to study cellular processes such as cell motility, chemotaxis, signal transduction, cellular response to drugs, and host-pathogen interactions. Its genome contains significant orthologs of vertebrate, yeast and microbial genes, attracting researchers interested in a wide variety of biological topics, including human disease, multicellular differentiation and comparative genomics. dictyBase enables researchers to search, view and download up-to-date genomic, functional and technical information. It is also widely used by teachers and instructors because of its wealth of teaching materials and research protocols. Dictyostelium investigators depend on dictyBase as their primary community resource, where help is available from dictyBase staff (dictyBase help line) or from other users (Dicty ListServ, moderated by dictyBase). We are in the final stages of deploying a completely new technology stack. By the end of this year dictyBase will run entirely as a cloud-based application. This proposal seeks support to continue operating and expanding this important community resource. Our goals for this proposal are:
(Aim 1) To continue (a) expert curation by dictyBase curators and enable (b) community curation, leveraging our strong relationship with the community. We will use additional sequence data to (c) update the AX4 reference genome sequence, and improve the efficiency of curation by using (d) deep learning-based linking of papers to genes, prioritizing them for further analysis and curation.
(Aim 2) We will improve dictyBase utility and usability by implementing (a) bulk annotation methods for importing large-scale data sets using both (i) a web interface and (ii) a script/command-line method. (b) We will add 10 additional Dictyostelid genomes, using automated methods to annotate them. (c) We will improve usability by implementing a concurrent BLAST search with a new user interface and integrating it with the JBrowse display.
(Aim 3) To expand the data and increase the richness of annotations available in dictyBase, we will implement mechanisms to capture, store and display: (a) additional context for GO annotations (i) using existing GO extensions and (ii) annotating and displaying biological pathways using GO-CAM models; (b) genome-wide insertion mutant information for over 20 thousand insertional mutants; and (c) spatial expression data annotated with Dictyostelium anatomy ontology terms, shown graphically (i) by adding a track in JBrowse for genes annotated with spatial / anatomy expression terms, and (ii) by creating a graphical display of these annotations via our Circos-based dashboard tool. As other data sets become available, we will add them to dictyBase and develop methods to display the data and make it searchable.

Public Health Relevance

dictyBase is the model organism database (MOD) for the eukaryote Dictyostelium discoideum and related species. Dictyostelium is widely used for research in the biomedical, genetic, and environmental domains. The database uses the genome of Dictyostelium to organize biological knowledge developed using this experimental system, and it is manually curated and kept up to date with the current literature. This application proposes capturing new types of data and providing tools to search and visualize that data.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 2R01GM064426-17
Application #: 9738586
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Maas, Stefan
Project Start: 2002-08-01
Project End: 2021-08-31
Budget Start: 2019-09-15
Budget End: 2020-08-31
Support Year: 17
Fiscal Year: 2019
Total Cost:
Indirect Cost:

Name: Northwestern University at Chicago
Department: Anatomy/Cell Biology
Type: Schools of Medicine
DUNS #: 005436803
City: Chicago
State: IL
Country: United States
Zip Code: 60611

Basu, Siddhartha; Fey, Petra; Jimenez-Morales, David et al. (2015) dictyBase 2015: Expanding data and annotations in a new software environment. Genesis 53:523-534
Basu, Siddhartha; Fey, Petra; Pandit, Yogesh et al. (2013) DictyBase 2013: integrating multiple Dictyostelid species. Nucleic Acids Res 41:D676-83
Fey, Petra; Dodson, Robert J; Basu, Siddhartha et al. (2013) One stop shop for everything Dictyostelium: dictyBase and the Dicty Stock Center in 2012. Methods Mol Biol 983:59-92
Van Auken, Kimberly; Fey, Petra; Berardini, Tanya Z et al. (2012) Text mining in the biocuration workflow: applications for literature curation at WormBase, dictyBase and TAIR. Database (Oxford) 2012:bas040
Sucgang, Richard; Kuo, Alan; Tian, Xiangjun et al. (2011) Comparative genomics of the social amoebae Dictyostelium discoideum and Dictyostelium purpureum. Genome Biol 12:R20
Gaudet, Pascale; Bairoch, Amos; Field, Dawn et al. (2011) Towards BioDBcore: a community-defined information specification for biological databases. Nucleic Acids Res 39:D7-10
Yu, Bing; Fey, Petra; Kestin-Pilcher, Karen E et al. (2011) Spliceosomal genes in the D. discoideum genome: a comparison with those in H. sapiens, D. melanogaster, A. thaliana and S. cerevisiae. Protein Cell 2:395-409
Gaudet, Pascale; Bairoch, Amos; Field, Dawn et al. (2011) Towards BioDBcore: a community-defined information specification for biological databases. Database (Oxford) 2011:baq027
Gaudet, Pascale; Fey, Petra; Basu, Siddhartha et al. (2011) dictyBase update 2011: web 2.0 functionality and the initial steps towards a genome portal for the Amoebozoa. Nucleic Acids Res 39:D620-4
Reference Genome Group of the Gene Ontology Consortium (2009) The Gene Ontology's Reference Genome Project: a unified framework for functional annotation across species. PLoS Comput Biol 5:e1000431

Showing the most recent 10 out of 19 publications