Bioinformatics Software for Analyzing Microbial Genomes

Salzberg, Steven

Abstract

This project will support the continued development and maintenance of four bioinformatics systems, all of which are used for microbial genomics research.

The most widely used of these systems, Glimmer, is used to find genes in bacteria, viruses, archaea, and simple eukaryotes. It can find over 99% of the genes in bacteria fully automatically, and it has been used as part of dozens of genome annotation efforts. The system has been distributed (free, including source code) to over 1400 academic and government laboratories and institutions. This project will support these users with continued improvements that include new features to permit Glimmer's use on incomplete genomes, improved detection of start codons, and a more user-friendly interface.

The second system, PANDA, is a new system for creating non-redundant protein sequence databases, which are a key tool in genome sequence analysis. PANDA is an important resource for both prokaryotic and eukaryotic genomics research. This project will support the creation and regular updates of a comprehensive database containing proteins from all species, a specialized database of bacterial proteins, a database of mammalian proteins, and others. All databases will be freely available for download and will be regularly rebuilt with the latest genome data.

The third system, TransTerm, finds transcription terminators in microbial genomes. TransTerm has been distributed for free to over 500 laboratories, and it will be extended to find new types of terminators and to recognize anti-terminators. This project will also support the maintenance of a website that contains all terminators from the latest set of completed genomes.

The fourth system identifies operons in microbial genomes, using conserved synteny across species as the basis for its predictions. This project will support enhancements to the software and regular updates to the operon database, which needs to be modified to incorporate new genomes as they appear. Both the software and the operon database will be freely available to the scientific community.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 7R01LM007938-03
Application #: 7108914
Study Section: Genome Study Section (GNM)
Program Officer: Ye, Jane

Project Start: 2003-09-01
Project End: 2007-08-31
Budget Start: 2005-09-01
Budget End: 2006-08-31
Support Year: 3
Fiscal Year: 2005
Total Cost: $185,625

Institution

Name: University of Maryland College Park
Type: Organized Research Units
DUNS #: 790934285

City: College Park
State: MD
Country: United States
Zip Code: 20742

Related projects


Award	Title	Investigator / Institution	Total Cost
NIH 2006 R01 LM	Bioinformatics Software for Analyzing Microbial Genomes	Salzberg, Steven L. / University of Maryland College Park	$181,264
NIH 2005 R01 LM	Bioinformatics Software for Analyzing Microbial Genomes	Salzberg, Steven L. / University of Maryland College Park	$185,625
NIH 2004 R01 LM	Bioinformatics Software for Analyzing Microbial Genomes	Salzberg, Steven L. / Institute for Genomic Research	$194,875
NIH 2003 R01 LM	Bioinformatics Software for Analyzing Microbial Genomes	Salzberg, Steven L. / Institute for Genomic Research	$194,875

Publications

Phillippy, Adam M; Schatz, Michael C; Pop, Mihai (2008) Genome assembly forensics: finding the elusive mis-assembly. Genome Biol 9:R55

Salzberg, Steven L; Kingsford, Carl; Cattoli, Giovanni et al. (2007) Genome analysis linking recent European and African influenza (H5N1) viruses. Emerg Infect Dis 13:713-8

Salzberg, Steven L (2007) Genome re-annotation: a wiki solution? Genome Biol 8:102

Pertea, Mihaela; Mount, Stephen M; Salzberg, Steven L (2007) A computational survey of candidate exonic splicing enhancer motifs in the model plant Arabidopsis thaliana. BMC Bioinformatics 8:159

Schatz, Michael C; Trapnell, Cole; Delcher, Arthur L et al. (2007) High-throughput sequence alignment using Graphics Processing Units. BMC Bioinformatics 8:474

Delcher, Arthur L; Bratke, Kirsten A; Powers, Edwin C et al. (2007) Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 23:673-9

Kingsford, Carl; Delcher, Arthur L; Salzberg, Steven L (2007) A unified model explaining the offsets of overlapping and near-overlapping prokaryotic genes. Mol Biol Evol 24:2091-8

Ghedin, Elodie; Wang, Shiliang; Spiro, David et al. (2007) Draft genome of the filarial nematode parasite Brugia malayi. Science 317:1756-60

Sommer, Daniel D; Delcher, Arthur L; Salzberg, Steven L et al. (2007) Minimus: a fast, lightweight genome assembler. BMC Bioinformatics 8:64

Kingsford, Carleton L; Ayanbule, Kunmi; Salzberg, Steven L (2007) Rapid, accurate, computational discovery of Rho-independent transcription terminators illuminates their relationship to DNA uptake. Genome Biol 8:R22

Showing the most recent 10 out of 21 publications
