The open reading frame transcribed from the unr gene (immediately upstream of N-ras) in mammals consists of multiple repeats similar to the cold-shock domain (CSD), a putative DNA-binding motif found in prokaryotic cold-shock proteins and eukaryotic DNA-binding proteins. Alignment of the CSD sequences of unr with those from other proteins reveals a core of similarity for which a consistent secondary structure prediction can be derived. This prediction suggests that the CSD consists primarily of beta-sheet, in contrast to most known eukaryotic DNA-binding proteins. Sequence analysis of the 3' end of the guinea pig unr gene shows that the core of one CSD repeat is encoded in a single exon, consistent with the modular assembly of the gene from ancestral CSD-coding units (Doniger et al., 1992). Further sequence analysis has revealed a short motif of 8 amino acids corresponding to the RNP-1 motif found in canonical RNA-binding domains (Landsman, 1992). The CSD family of proteins, which includes several transcription factors shown to bind DNA specifically, has thus been found to contain a motif similar to RNP-1. A non-redundant protein sequence database was searched with regular expressions and with a weight/residue position matrix of the RNP-1 motif, which identified numerous known members of the RNA-binding family of proteins. In addition, the search showed that the CSD-containing family of proteins includes a motif that is almost identical to the RNP-1 motif. An assessment of the statistical significance of this analysis showed that the RNP-1 motifs from these two families of proteins are indeed similar. It is conceivable that the RNP-1 motif in the CSD-containing proteins enables them to bind both double- and single-stranded nucleic acids, DNA as well as RNA. This suggests that the CSD-containing proteins could be involved in transcription as well as in post-transcriptional gene regulation through RNA binding.
The initial phases of modelling a CSD based on the crystal structure of an RNA-binding protein have been attempted.
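
The two search strategies described above, a regular expression and a weight/residue position matrix, can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the pattern is a simplified form loosely modelled on the commonly cited RNP-1 octamer consensus, the training motifs and the function names are hypothetical, and the background model is uniform. The actual expressions, matrix, and database used in the project are not given in this abstract.

```python
import math
import re

# Simplified 8-residue pattern, loosely modelled on the commonly cited RNP-1
# consensus. This is an illustrative assumption, not the project's expression.
RNP1_PATTERN = re.compile(r"[KR]G[FY][GA]F[VI].[FY]")

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical motif instances used only to build a toy matrix.
TRAINING_MOTIFS = ["KGFGFVEF", "RGFGFVTF", "KGYGFVHF", "RGFAFVEY"]


def regex_hits(seq):
    """Return (start, matched_text) for each non-overlapping pattern match."""
    return [(m.start(), m.group()) for m in RNP1_PATTERN.finditer(seq)]


def build_pwm(aligned_motifs, pseudocount=1.0):
    """Build a log-odds matrix from equal-length motifs (uniform background)."""
    background = 1.0 / len(AMINO_ACIDS)
    pwm = []
    for pos in range(len(aligned_motifs[0])):
        column = [motif[pos] for motif in aligned_motifs]
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        pwm.append({
            aa: math.log2(((column.count(aa) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pwm


def best_window(seq, pwm):
    """Slide the matrix along seq and return (score, start) of the best window.

    Residues outside the 20 standard amino acids score 0.0.
    Returns None if seq is shorter than the matrix.
    """
    width = len(pwm)
    if len(seq) < width:
        return None
    scored = [
        (sum(pwm[i].get(seq[start + i], 0.0) for i in range(width)), start)
        for start in range(len(seq) - width + 1)
    ]
    return max(scored)
```

The regular expression gives a strict yes/no match, whereas the matrix ranks every window by how closely it resembles the training motifs, so it can surface near-matches that the pattern rejects. Judging whether a given score is statistically significant would require a separate step, for example comparison against scores from shuffled sequences, which is not shown here.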

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Intramural Research (Z01)
Project #: 1Z01LM000038-02
Application #: 3781276
Support Year: 2
Fiscal Year: 1993

Name: National Library of Medicine
Country: United States