Recent advances in DNA sequencing techniques have led to the determination of many entire genome sequences. New insights into the biological functions and evolution of these organisms have been gained from this information. A new, qualitatively different kind of analysis is possible with complete genome sequence data: the evaluation of apparently missing genes and the potential consequences of their loss for the biology of the organism. To systematically identify potentially missing genes, one must first classify genes from a number of organisms into groups of orthologs. Orthologs are genes from different organisms derived from the same gene in the closest common ancestor of these organisms. They are thus the genes most likely to perform biologically similar functions, and they often share the greatest sequence similarity. Once these classifications are made, one examines the phylogenetic pattern of each ortholog group (which organisms contain a member and which do not) to identify genes potentially lost in the studied organism relative to the reference organisms. In addition, global properties of proteins may be studied from a genomic perspective, for example, the relationship between sequence length and conservation. These approaches may also be used to study bacterial and viral pathogens. We will focus initially on influenza virus, using complete genome sequences to better understand the epidemiology and natural history of the virus. This understanding may be useful in improving surveillance, formulating vaccines, and preparing for pandemics.
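
The phylogenetic-pattern step described above can be illustrated with a minimal sketch. It assumes the ortholog classification has already been done and is available as a mapping from a group identifier to the set of organisms that contain a member of that group. The function name `potentially_lost_genes`, the `min_fraction` threshold, and the group and organism labels are all hypothetical, chosen only for illustration; they are not taken from the project itself.

```python
from typing import Dict, List, Set


def potentially_lost_genes(
    ortholog_groups: Dict[str, Set[str]],
    target: str,
    references: List[str],
    min_fraction: float = 1.0,
) -> List[str]:
    """Return IDs of ortholog groups that are absent from the target organism
    but present in at least `min_fraction` of the reference organisms.

    `min_fraction` is an illustrative parameter (an assumption, not part of
    the source description): 1.0 requires the gene in every reference.
    """
    if not references:
        raise ValueError("at least one reference organism is required")

    candidates = []
    for group_id, organisms in ortholog_groups.items():
        # A group that includes the target is not a candidate loss.
        if target in organisms:
            continue
        # Count how many reference organisms carry a member of this group.
        present = sum(1 for ref in references if ref in organisms)
        if present / len(references) >= min_fraction:
            candidates.append(group_id)
    return sorted(candidates)


# Hypothetical presence/absence patterns for four ortholog groups.
groups = {
    "OG0001": {"org_A", "org_B", "org_C", "target"},
    "OG0002": {"org_A", "org_B", "org_C"},
    "OG0003": {"org_A"},
    "OG0004": {"org_B", "org_C"},
}
refs = ["org_A", "org_B", "org_C"]

strict = potentially_lost_genes(groups, "target", refs)
relaxed = potentially_lost_genes(groups, "target", refs, min_fraction=0.6)
print(strict)
print(relaxed)
```

A group flagged this way is only apparently missing: the absence could also reflect a gene that diverged beyond recognition or an error in the ortholog classification, so each candidate would still need to be evaluated individually.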

National Library of Medicine
Benson, Dennis A; Clark, Karen; Karsch-Mizrachi, Ilene et al. (2015) GenBank. Nucleic Acids Res 43:D30-5
Benson, Dennis A; Clark, Karen; Karsch-Mizrachi, Ilene et al. (2014) GenBank. Nucleic Acids Res 42:D32-7
Koonin, Eugene V; Landweber, Laura F; Lipman, David J (2013) Biology Direct: celebrating 7 years of open, published peer review. Biol Direct 8:11
Du, Xiangjun; Lipman, David J; Cherry, Joshua L (2013) Why does a protein's evolutionary rate vary over time? Genome Biol Evol 5:494-503
Carter, Donald M; Bloom, Chalise E; Nascimento, Eduardo J M et al. (2013) Sequential seasonal H1N1 influenza virus infections protect ferrets against novel 2009 H1N1 influenza virus. J Virol 87:1400-10
Benson, Dennis A; Cavanaugh, Mark; Clark, Karen et al. (2013) GenBank. Nucleic Acids Res 41:D36-42
Carter, Donald M; Lu, Hai-Rong; Bloom, Chalise E et al. (2012) Complex patterns of human antisera reactivity to novel 2009 H1N1 and historical H1N1 influenza strains. PLoS One 7:e39435
Benson, Dennis A; Karsch-Mizrachi, Ilene; Clark, Karen et al. (2012) GenBank. Nucleic Acids Res 40:D48-53
Sayers, Eric W; Barrett, Tanya; Benson, Dennis A et al. (2012) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 40:D13-25
Boratyn, Grzegorz M; Schaffer, Alejandro A; Agarwala, Richa et al. (2012) Domain enhanced lookup time accelerated BLAST. Biol Direct 7:12
