FlyBase is a bioinformatics resource that captures, organizes and presents core information on Drosophila (fly) genomics and genetics, drawn both from the literature and from large-scale data generation projects. FlyBase provides an openly accessible, centralized resource for Drosophila genetic and genomic data to enable researchers and educators worldwide, both in the Drosophila community and in the broader biomedical sciences community, to further their research. Drosophila is one of the premier model organisms and provides cost-effective help in elucidating the basic tenets of genetic and developmental mechanisms. FlyBase has three main goals. First, it curates literature and reagents relevant to Drosophila research, so that researchers can continue to rely on FlyBase to find the latest innovations in the field. FlyBase prioritizes curation of data sets relevant to gene expression, cellular functions, signaling pathways, and results relevant to human biology, and displays the information in an intuitive, integrated, readily searchable format. Second, it strives to be of value to the broader genetics and population genetics communities by curating and integrating relevant data sets and by developing tools that enable better access to this wealth of data. Finally, it seeks to develop and integrate tools to expand its utility. FlyBase has a long history of successfully serving the scientific community in this capacity. FlyBase works closely with other Model Organism Databases (MODs) to integrate data sets and develop tools to enable cross-species analyses. NSF funding allows FlyBase to maintain its three main activities.

Activities essential to the operation of FlyBase include: an automated triaging pipeline; full genetic curation with emphasis on genome feature curation and physical interaction curation; human disease model curation; allele-based curation of disease models based on the Disease Ontology; Gene Group curation; curation of datasets deemed to be of highest general interest to the FlyBase user community; import of graphical abstracts in FlyBase references; identification of new lncRNAs, anti-sense lncRNAs and smORFs; incorporation of available transcription start site data into FlyBase; addition of new anatomy terms; review and improvement of phenotypic class ontologies; development of database modules for gene groups and human disease models; annotation of all Drosophila cell types and curation of scRNAseq data sets; update of the genomic sequences of all Drosophila Genetic Reference Panel (DGRP) strains; and expansion of a Pathway page resource. To facilitate integrative analyses and approaches, FlyBase continues to expand its utility as a platform by integrating and displaying large-scale studies and transcriptomics and proteomics data sets. In addition, FlyBase improves access to and display of tools available within the community, and incorporates the most useful data sets and tools for visualizing complex data sets, enabling more researchers to take a more global approach to their genetic research. NSF funding supports the computational database pipeline infrastructure for FlyBase data, allowing continued coordination with the ongoing operation and maintenance of the existing infrastructure to enable discovery in the biological sciences.
FlyBase will maintain and further develop the central database that houses all FlyBase data; manage the incorporation of additional curated or large-scale data into the database; produce weekly internal reports and bi-monthly outputs that are transformed into the updates to the public FlyBase servers; and work closely with NCBI to incorporate genome annotation-related data into FlyBase and to submit periodic whole genome annotation updates to GenBank. Collectively, these activities maximize the impact of essential Drosophila research on discovery and translation. Through its role in carrying out these missions, FlyBase will continue to help Drosophila research advance as quickly as possible for the ultimate benefit of various scientific communities via http://flybase.org.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Application #: 2039324
Program Officer: Steven Ellis
Project Start:
Project End:
Budget Start: 2021-04-01
Budget End: 2025-03-31
Support Year:
Fiscal Year: 2020
Total Cost: $324,705
Indirect Cost:
Name: Harvard University
Department:
Type:
DUNS #:
City: Cambridge
State: MA
Country: United States
Zip Code: 02138