Conserved amino acid sequence motifs in different groups of NTP-utilizing enzymes were studied using computer methods of sequence analysis, with the aim of predicting NTPase activity of unexplored proteins, designing schemes for identification of NTPases in sequence databases, and generating a sequence-based classification of this class of enzymes. NTPases are characterized by well-defined conserved motifs that are implicated in substrate binding and hydrolysis. Amino acid sequence databases were searched for the so-called A motif, which is involved in phosphate binding, and the resulting set of proteins was explored in detail with respect to their similarities with other proteins and the available data on NTPase activity. A new superfamily of (putative) DNA-dependent ATPases was described that includes the ATPase domains of prokaryotic NtrC-related transcription regulators, MCM proteins involved in the initiation of eukaryotic DNA replication, and a group of uncharacterized bacterial and chloroplast proteins. MCM proteins were shown to contain a modified form of the ATP-binding motif and are predicted to mediate ATP-dependent opening of double-stranded DNA in the replication origins. In a second line of investigation, it was demonstrated that the products of unidentified open reading frames from Marchantia mitochondria and from yeast, and a domain of a baculovirus protein involved in viral DNA replication, are related to superfamily III of DNA and RNA helicases, which was previously known to include only proteins of small viruses. Comparison of the multiple alignments showed that the proteins of the NtrC superfamily and the helicases of superfamily III share three related sequence motifs tightly packed in an ATPase domain that consists of 100-150 amino acid residues. A similar array of conserved motifs was found in the family of DnaA-related ATPases.
It is hypothesized that the three large groups of nucleic acid-dependent ATPases have a similar structure of the core ATPase domain and have evolved from a common ancestor. Several previously uncharacterized proteins were shown to contain conserved sequence motifs typical of helicase superfamilies I or II and were predicted to possess helicase activity. A general classification of DNA and RNA helicases based on sequence comparison was outlined, and an attempt was made to derive an identifying sequence pattern for each large group. The significance of the project lies in the prediction of NTPase activity for many proteins with unknown functions, the characterization of allowed deviations in NTP-binding motifs, the derivation of identifying patterns for different groups of NTPases, and the development of a sequence-based classification for a vast enzyme class.
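
The database search for the A motif described above can be illustrated with a small pattern scan. This is only a sketch under stated assumptions: the abstract does not give the pattern that was used, so the consensus [AG]-x(4)-G-K-[ST] (a commonly cited form of the phosphate-binding loop) is assumed here, and the function names and toy sequences are hypothetical.

```python
import re

# ASSUMPTION: consensus for the phosphate-binding "A" motif, [AG]-x(4)-G-K-[ST].
# The abstract does not state the pattern actually used in the project.
# The lookahead lets overlapping occurrences be reported as separate hits.
A_MOTIF = re.compile(r"(?=([AG].{4}GK[ST]))")


def find_a_motif(sequence):
    """Return (1-based start, matched segment) for every A-motif hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in A_MOTIF.finditer(seq)]


def scan_database(records):
    """Keep only the records with at least one hit.

    records: iterable of (name, sequence) pairs (hypothetical input format).
    """
    hits = {}
    for name, seq in records:
        found = find_a_motif(seq)
        if found:
            hits[name] = found
    return hits


if __name__ == "__main__":
    # Toy sequences invented for illustration; they are not real proteins.
    toy = [
        ("toy_ntpase", "MSTLKGPPGTGKSTLAKAI"),
        ("toy_other", "MSTLKAAPLLEEDVVKAI"),
    ]
    print(scan_database(toy))
```

A pattern this short can match by chance, which is consistent with the abstract's follow-up step of examining each retrieved protein for similarity to other proteins. A strict pattern would also miss modified forms of the motif, such as the one reported for MCM proteins.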

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Intramural Research (Z01)
Project #: 1Z01LM000037-02
Application #: 3781275
Support Year: 2
Fiscal Year: 1993
Name: National Library of Medicine
Country: United States