High-throughput sequencing technologies have made possible both personal human genome sequencing and rapid re-sequencing of many organisms. The dramatic increase in the throughput of these technologies demands equal progress in the technologies used to manage their outputs. Over the last decade, ontologies have emerged as indispensable tools for the management of large biomedical datasets. The Sequence Ontology (SO) is the world's most widely used ontology for describing sequence annotations. The advent of rapid genome re-sequencing has made it essential that SO also provide the means to describe sequence variants. This renewal submission thus has two broad goals: (1) extend SO into the realm of genomic variant annotation, and (2) harmonize SO with recent developments in the fields of biomedical ontology and genomics. Both are essential if we are to meet the data management needs of researchers seeking to exchange, compare and analyze re-sequenced genomes and their variants in the context of existing gene annotations.

Public Health Relevance

The gigantic datasets produced by personal human genome sequencing present daunting challenges for data management. This proposal seeks funds to extend tools for describing genome annotations into the realm of sequence variation. Doing so will facilitate the exchange, comparison and analysis of re-sequenced genomes.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Bonazzi, Vivien
University of Utah
Schools of Medicine
Salt Lake City
United States
Desvignes, T; Batzel, P; Berezikov, E et al. (2015) miRNA Nomenclature: A View Incorporating Genetic Origins, Biosynthetic Pathways, and Sequence Variants. Trends Genet 31:613-626
Cunningham, Fiona; Moore, Barry; Ruiz-Schultz, Nicole et al. (2015) Improving the Sequence Ontology terminology for genomic variant annotation. J Biomed Semantics 6:32
Welch, Brandon M; Eilbeck, Karen; Del Fiol, Guilherme et al. (2014) Technical desiderata for the integration of genomic data with clinical decision support. J Biomed Inform 51:3-7
Welch, Brandon M; Loya, Salvador Rodriguez; Eilbeck, Karen et al. (2014) A proposed clinical decision support architecture capable of supporting whole genome sequence information. J Pers Med 4:176-99
Singleton, Marc V; Guthery, Stephen L; Voelkerding, Karl V et al. (2014) Phevor combines multiple biomedical ontologies for accurate identification of disease-causing alleles in single individuals and small nuclear families. Am J Hum Genet 94:599-610
Mungall, Christopher J; Batchelor, Colin; Eilbeck, Karen (2011) Evolution of the Sequence Ontology terms and relationships. J Biomed Inform 44:87-93
Reese, Martin G; Moore, Barry; Batchelor, Colin et al. (2010) A standard variation file format for human genome sequences. Genome Biol 11:R88
Gene Ontology Consortium (2010) The Gene Ontology in 2010: extensions and refinements. Nucleic Acids Res 38:D331-5
Moore, Barry; Fan, Guozhen; Eilbeck, Karen (2010) SOBA: sequence ontology bioinformatics analysis. Nucleic Acids Res 38:W161-4
Eilbeck, Karen; Moore, Barry; Holt, Carson et al. (2009) Quantitative measures for the management and comparison of annotated genomes. BMC Bioinformatics 10:67