Genome annotations combine sequence, the results of bioinformatics analyses, and the knowledge of human curators into models of gene structure. These annotations provide a basic resource for investigations into the genetic causes of human disease. Despite their potential as a resource for such studies, genome annotations have proven difficult to use. A major reason for this has been the lack of community standards for describing them, which has resulted in the proliferation of arbitrary file formats and database schemas. To solve this problem, the Gene Ontology Consortium has developed the Sequence Ontology (SO). The purpose of SO is to unify the description of genome annotations. Many model organism databases, such as SGD, WormBase and FlyBase, have now adopted SO and release their annotations in SO-compliant formats. Many other genome databases are attempting to follow suit, but are finding it difficult to do so. One reason for their difficulties is the lack of publicly available software for managing and distributing SO-compliant genome annotations. The goal of this proposal is to further develop, improve and consolidate existing software tools that will help the broader genomics community use the Sequence Ontology to produce, manage, and disseminate SO-compliant genome annotations. Our proposed data adapters and converters will help bring old annotation data and software forward; our SO-based quality control pipelines will ensure that the data produced by different databases are indeed interoperable; and our navigation and database search tools will help human curators produce higher-quality SO-compliant genome annotations.
Desvignes, T; Batzel, P; Berezikov, E et al. (2015) miRNA Nomenclature: A View Incorporating Genetic Origins, Biosynthetic Pathways, and Sequence Variants. Trends Genet 31:613-626
Cunningham, Fiona; Moore, Barry; Ruiz-Schultz, Nicole et al. (2015) Improving the Sequence Ontology terminology for genomic variant annotation. J Biomed Semantics 6:32
Welch, Brandon M; Eilbeck, Karen; Del Fiol, Guilherme et al. (2014) Technical desiderata for the integration of genomic data with clinical decision support. J Biomed Inform 51:3-7
Welch, Brandon M; Loya, Salvador Rodriguez; Eilbeck, Karen et al. (2014) A proposed clinical decision support architecture capable of supporting whole genome sequence information. J Pers Med 4:176-99
Singleton, Marc V; Guthery, Stephen L; Voelkerding, Karl V et al. (2014) Phevor combines multiple biomedical ontologies for accurate identification of disease-causing alleles in single individuals and small nuclear families. Am J Hum Genet 94:599-610
Mungall, Christopher J; Batchelor, Colin; Eilbeck, Karen (2011) Evolution of the Sequence Ontology terms and relationships. J Biomed Inform 44:87-93
Reese, Martin G; Moore, Barry; Batchelor, Colin et al. (2010) A standard variation file format for human genome sequences. Genome Biol 11:R88
Gene Ontology Consortium (2010) The Gene Ontology in 2010: extensions and refinements. Nucleic Acids Res 38:D331-5
Moore, Barry; Fan, Guozhen; Eilbeck, Karen (2010) SOBA: sequence ontology bioinformatics analysis. Nucleic Acids Res 38:W161-4
Eilbeck, Karen; Moore, Barry; Holt, Carson et al. (2009) Quantitative measures for the management and comparison of annotated genomes. BMC Bioinformatics 10:67