High-throughput sequencing technologies have made possible both personal human genome sequencing and rapid re-sequencing of many organisms. The dramatic increase in the throughput of these technologies demands equal progress in the technologies used to manage their outputs. Over the last decade, ontologies have emerged as indispensable tools for the management of large biomedical datasets. The Sequence Ontology (SO) is the world's most widely used ontology for describing sequence annotations. The advent of rapid genome re-sequencing has made it essential that SO also provide the means to describe sequence variants. This renewal submission thus has two broad goals: (1) extend SO into the realm of genomic variant annotation, and (2) harmonize SO with recent developments in the fields of biomedical ontology and genomics. Both are essential if we are to meet the data management needs of researchers seeking to exchange, compare and analyze re-sequenced genomes and their variants in the context of existing gene annotations.

Public Health Relevance

The gigantic datasets produced by personal human genome sequencing present daunting challenges for data management. This proposal seeks funds to extend tools for describing genome annotations into the realm of sequence variation. Doing so will facilitate the exchange, comparison and analysis of re-sequenced genomes.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG004341-07
Application #: 8462288
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Bonazzi, Vivien
Project Start: 2007-08-15
Project End: 2014-04-30
Budget Start: 2013-05-01
Budget End: 2014-04-30
Support Year: 7
Fiscal Year: 2013
Total Cost: $279,985
Indirect Cost: $85,435
Name: University of Utah
Department: Miscellaneous
Type: Schools of Medicine
DUNS #: 009095365
City: Salt Lake City
State: UT
Country: United States
Zip Code: 84112
Singleton, Marc V; Guthery, Stephen L; Voelkerding, Karl V et al. (2014) Phevor combines multiple biomedical ontologies for accurate identification of disease-causing alleles in single individuals and small nuclear families. Am J Hum Genet 94:599-610
Welch, Brandon M; Loya, Salvador Rodriguez; Eilbeck, Karen et al. (2014) A Proposed Clinical Decision Support Architecture Capable of Supporting Whole Genome Sequence Information. J Pers Med 4:176-199
Mungall, Christopher J; Batchelor, Colin; Eilbeck, Karen (2011) Evolution of the Sequence Ontology terms and relationships. J Biomed Inform 44:87-93
Moore, Barry; Fan, Guozhen; Eilbeck, Karen (2010) SOBA: sequence ontology bioinformatics analysis. Nucleic Acids Res 38:W161-4
Gene Ontology Consortium (2010) The Gene Ontology in 2010: extensions and refinements. Nucleic Acids Res 38:D331-5
Reese, Martin G; Moore, Barry; Batchelor, Colin et al. (2010) A standard variation file format for human genome sequences. Genome Biol 11:R88
Eilbeck, Karen; Moore, Barry; Holt, Carson et al. (2009) Quantitative measures for the management and comparison of annotated genomes. BMC Bioinformatics 10:67