In the last few years, the rapid accumulation of genome sequences and protein structures has been paralleled by major advances in sequence database search methods. The powerful Position-Specific Iterated BLAST (PSI-BLAST) method developed at the NCBI formed the basis of our work on protein motif analysis. We developed a new mode of PSI-BLAST application, implemented as an automatic procedure, in which the database is searched exhaustively: PSI-BLAST iterations are run to convergence and then repeated with newly identified protein family members. Two other new procedures, IMPALA and RPS-BLAST, allow one to search a library of protein family profiles using an individual protein sequence as the query. The BLAST-CLUST procedure was developed to flexibly cluster proteins by sequence similarity, taking BLAST search outputs as its input. These methods were applied to perform a systematic survey of completely sequenced genomes and to produce a census of protein structural folds. A theoretical study on predicting the total number of protein folds and families was performed, yielding estimates of approximately 1000 folds and approximately 5000 families. The evolutionary history and phyletic distribution of several types of protein domains were analyzed in detail, including a variety of proteins involved in RNA metabolism and programmed cell death, the vast class of GTPases and related ATPases, P-loop kinases, and a variety of other protein classes.
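
To make the idea of clustering proteins from BLAST search outputs more concrete, the following is a minimal sketch, not the BLAST-CLUST implementation itself. It assumes single-linkage clustering, and it assumes that each hit has already been parsed into a `(query_id, subject_id, percent_identity, evalue)` tuple. The function name `cluster_by_similarity`, the thresholds `min_identity` and `max_evalue`, and the example hits are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def cluster_by_similarity(hits, min_identity=30.0, max_evalue=1e-5):
    """Single-linkage clustering of sequence IDs from pairwise hits.

    hits: iterable of (query_id, subject_id, percent_identity, evalue).
    Thresholds are illustrative assumptions, not values from the project.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for query, subject, identity, evalue in hits:
        # Register both sequences so that unlinked ones become singletons.
        find(query)
        find(subject)
        # Link two sequences only when the hit passes both thresholds.
        if identity >= min_identity and evalue <= max_evalue:
            union(query, subject)

    clusters = defaultdict(set)
    for seq_id in list(parent):
        clusters[find(seq_id)].add(seq_id)

    # Largest clusters first; ties broken by member names.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


# Hypothetical hits: A-B and B-C pass the thresholds, C-D does not.
example_hits = [
    ("A", "B", 45.0, 1e-20),
    ("B", "C", 38.0, 1e-10),
    ("C", "D", 25.0, 1e-3),
    ("E", "F", 90.0, 1e-50),
]
example_clusters = cluster_by_similarity(example_hits)
```

Under single linkage, A and C end up in the same cluster even though no direct A-C hit is listed, because both are linked to B. Raising `min_identity` or lowering `max_evalue` splits clusters more finely, which is one way a "flexible" clustering of this kind could be tuned.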

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Intramural Research (Z01)
Project #: 1Z01LM000061-10
Application #: 6843569
Study Section: (CBB)
Project Start:
Project End:
Budget Start:
Budget End:
Support Year: 10
Fiscal Year: 2003
Total Cost:
Indirect Cost:
Name: National Library of Medicine
Department:
Type:
DUNS #:
City:
State:
Country: United States
Zip Code:
Ng, C Leong; Waterman, David G; Koonin, Eugene V et al. (2009) Conformational flexibility and molecular interactions of an archaeal homologue of the Shwachman-Bodian-Diamond syndrome protein. BMC Struct Biol 9:32
Yutin, Natalya; Wolf, Maxim Y; Wolf, Yuri I et al. (2009) The origins of phagocytosis and eukaryogenesis. Biol Direct 4:9
Wolf, Yuri I; Novichkov, Pavel S; Karev, Georgy P et al. (2009) Inaugural Article: The universal distribution of evolutionary rates of genes and distinct characteristics of eukaryotic genes of different apparent ages. Proc Natl Acad Sci U S A 106:7273-80
Koonin, Eugene V; Aravind, L (2009) Comparative genomics, evolution and origins of the nuclear envelope and nuclear pore complex. Cell Cycle 8:1984-5
Makarova, Kira S; Wolf, Yuri I; Koonin, Eugene V (2009) Comprehensive comparative-genomic analysis of type 2 toxin-antitoxin systems and related mobile stress response systems in prokaryotes. Biol Direct 4:19
Makarova, Kira S; Wolf, Yuri I; van der Oost, John et al. (2009) Prokaryotic homologs of Argonaute proteins are predicted to function as key components of a novel system of defense against mobile genetic elements. Biol Direct 4:29
Galperin, Michael Y (2008) Telling bacteria: do not LytTR. Structure 16:657-9
Hou, Shaobin; Makarova, Kira S; Saw, Jimmy H W et al. (2008) Complete genome sequence of the extremely acidophilic methanotroph isolate V4, Methylacidiphilum infernorum, a representative of the bacterial phylum Verrucomicrobia. Biol Direct 3:26
Basu, Malay Kumar; Carmel, Liran; Rogozin, Igor B et al. (2008) Evolution of protein domain promiscuity in eukaryotes. Genome Res 18:449-61
Elkins, James G; Podar, Mircea; Graham, David E et al. (2008) A korarchaeal genome reveals insights into the evolution of the Archaea. Proc Natl Acad Sci U S A 105:8102-7

Showing the most recent 10 out of 50 publications