Nucleic acid binding proteins are of major importance in biology because they include many of the factors that regulate gene expression and developmental processes. One major class of such factors is the family of zinc finger proteins, defined by conserved structural motifs that require zinc ions to fold into their active forms. This project is focused on a genetic and molecular analysis of zinc finger proteins and their genes. The question is being approached from two perspectives. (1) How many zinc finger proteins are encoded in the genome, and how are these genes distributed on the chromosomes? This question is being asked for the human genome with regard to the C2H2 family of zinc finger proteins. Over 200 genomic clones have been isolated; of these, over 90 clones have been sorted into 25 genes by cross-hybridization, partial sequencing, and PCR amplification. Further, these loci have been mapped onto the human chromosome complement by nonradioactive in situ hybridization techniques. The results show that zinc finger genes are distributed on many chromosomes but are especially abundant on chromosome 19. By extrapolation to additional clones obtained in this and in other laboratories, it can be estimated that the human genome contains at least one hundred, and probably several hundred, genes for zinc finger proteins of the C2H2 class. (2) To aid in structural and statistical analyses, a database of zinc finger proteins has been established in conjunction with the DCRT and with Grant Jacobs of the MRC, Cambridge. This database allows the comparison of over 1100 finger sequences from published and unpublished sources, providing an important resource for the analysis of zinc finger genes. The application of statistical methods to the database has begun to organize zinc finger sequences into subclasses. Correlations between subclass consensus sequences and known binding sites for some zinc finger proteins form the basis of attempts to develop general rules for the binding specificities of such proteins. Such general rules would have important predictive potential for future research.
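
As an illustration of what a subclass consensus sequence is, the following minimal sketch derives a per-position consensus from a set of already aligned sequences. It is not the statistical method applied to the database, which the abstract does not specify. The function name `consensus`, the `threshold` parameter, the use of `x` for variable positions, and the toy sequences are all assumptions made for this example; the toy sequences are invented and are not real zinc finger data.

```python
from collections import Counter


def consensus(aligned, threshold=0.5):
    """Return a per-column consensus for equal-length aligned sequences.

    A column is assigned its most common residue when that residue's
    frequency is at least `threshold`; otherwise it is assigned 'x'.
    (Hypothetical helper for illustration only.)
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")

    result = []
    # zip(*aligned) walks the alignment column by column.
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        result.append(residue if count / len(aligned) >= threshold else "x")
    return "".join(result)


# Invented toy alignment, NOT real finger sequences.
toy_alignment = ["CAAC", "CGTC", "CATC"]
print(consensus(toy_alignment))
print(consensus(toy_alignment, threshold=0.7))
```

Raising `threshold` marks more positions as variable, so the resulting consensus keeps only the strongly conserved residues; comparing such consensus strings across groups of sequences is one simple way to think about the subclasses mentioned above.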

Support Year: 2
Fiscal Year: 1991
Country: United States