High-throughput sequencing technologies have made possible both personal human genome sequencing and rapid re-sequencing of many organisms. The dramatic increase in the throughput of these technologies demands equal progress in the technologies used to manage their outputs. Over the last decade, ontologies have emerged as indispensable tools for the management of large biomedical datasets. The Sequence Ontology (SO) is the world's most widely used ontology for describing sequence annotations. The advent of rapid genome re-sequencing has made it essential that SO also provide the means to describe sequence variants. This renewal submission thus has two broad goals: (1) extend SO into the realm of genomic variant annotation, and (2) harmonize SO with recent developments in the fields of biomedical ontology and genomics. Both are essential if we are to meet the data management needs of researchers seeking to exchange, compare, and analyze re-sequenced genomes and their variants in the context of existing gene annotations.
The gigantic datasets produced by personal human genome sequencing present daunting challenges for data management. This proposal seeks funds to extend tools for describing genome annotations into the realm of sequence variation. Doing so will facilitate the exchange, comparison, and analysis of re-sequenced genomes.
|Singleton, Marc V; Guthery, Stephen L; Voelkerding, Karl V et al. (2014) Phevor combines multiple biomedical ontologies for accurate identification of disease-causing alleles in single individuals and small nuclear families. Am J Hum Genet 94:599-610|
|Welch, Brandon M; Loya, Salvador Rodriguez; Eilbeck, Karen et al. (2014) A Proposed Clinical Decision Support Architecture Capable of Supporting Whole Genome Sequence Information. J Pers Med 4:176-199|
|Mungall, Christopher J; Batchelor, Colin; Eilbeck, Karen (2011) Evolution of the Sequence Ontology terms and relationships. J Biomed Inform 44:87-93|
|Moore, Barry; Fan, Guozhen; Eilbeck, Karen (2010) SOBA: sequence ontology bioinformatics analysis. Nucleic Acids Res 38:W161-4|
|Gene Ontology Consortium (2010) The Gene Ontology in 2010: extensions and refinements. Nucleic Acids Res 38:D331-5|
|Reese, Martin G; Moore, Barry; Batchelor, Colin et al. (2010) A standard variation file format for human genome sequences. Genome Biol 11:R88|
|Eilbeck, Karen; Moore, Barry; Holt, Carson et al. (2009) Quantitative measures for the management and comparison of annotated genomes. BMC Bioinformatics 10:67|