The broad aim of this project is to develop new computational tools, based on computer vision, for determining the structures of biological macromolecules. An important problem in structural biochemistry is the determination of the 3-D structures of proteins. X-ray crystallography is one of the most widely used experimental methods for this problem, and fitting a molecular model into an electron density map is a central step in the X-ray crystallographic approach.
The specific aims of the project are: 1) to increase the degree of automation in electron density map fitting procedures; 2) to develop an automated electron density fitting system, first as a distributed system on a series of networked heterogeneous workstations, and then as a parallel system running on a supercomputer; 3) to develop other computer vision techniques to help structural biologists analyze and interpret electron density maps (these include edge detection and histogram enhancement, which in turn would be used to construct better envelopes for phase improvement by electron density modification); and 4) once the automated system is sufficiently developed, to apply it to the structure determination of a number of proteins, including the adenovirus single-stranded DNA-binding protein (DBP) complexed with DNA.
The electron density map will be reduced in an efficient and realistic manner to a skeletonized representation, which allows pattern recognition techniques to be applied to trace the polypeptide chain. To locate the Cα positions in skeletonized electron density maps, a set of templates will be used, ranging from tetrahedrally coordinated carbon atoms through peptide groups and amino acid residues to elements of secondary structure; a 3-D object recognition algorithm will be used to match the templates. Once the Cα positions have been located, a fast query system will be used to search the entire Brookhaven Protein Data Bank and build the polypeptide chain by a fragment fitting procedure. The interpretation of electron density maps in terms of an initial molecular model is an important step in the solution of macromolecular structures by X-ray crystallographic methods, and also one of the least automated; this operation typically takes on the order of months.
Consequently, the project should be of significant interest to the protein crystallographic community and to structural molecular biologists in general. The knowledge gained from protein structural information is highly relevant to disease and therapy, and the project will have an indirect impact on structure-based drug design.
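
The fragment fitting step described above can be illustrated with a minimal sketch. The abstract does not say how candidate fragments are scored, so the scoring used here is an assumption: each fragment is compared with the located trace through its internal point-to-point distances, which avoids any superposition. All names (`distance_rms`, `best_fragment`, `ideal_helix`, `extended_chain`) and the two-entry library are hypothetical, and the coordinates are idealized toy values rather than data from deposited structures.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """All inter-point distances of a trace, in a fixed (i < j) order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def distance_rms(a, b):
    """RMS difference between the internal distances of two equal-length traces."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("traces must have equal length of at least 2 points")
    da, db = pairwise_distances(a), pairwise_distances(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(da, db)) / len(da))


def best_fragment(query, library):
    """Return (name, score) of the same-length library fragment closest to query."""
    scored = [(distance_rms(query, coords), name)
              for name, coords in library.items() if len(coords) == len(query)]
    if not scored:
        raise ValueError("no library fragment has the query length")
    score, name = min(scored)
    return name, score


def ideal_helix(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """Toy alpha-helical C-alpha trace (angstroms) along the z axis."""
    step = math.radians(turn_deg)
    return [(radius * math.cos(i * step), radius * math.sin(i * step), rise * i)
            for i in range(n)]


def extended_chain(n, spacing=3.8):
    """Toy fully extended trace: points on a straight line, 3.8 angstroms apart."""
    return [(spacing * i, 0.0, 0.0) for i in range(n)]


# Hypothetical two-entry library; a real one would be extracted from deposited structures.
library = {"helix": ideal_helix(7), "extended": extended_chain(7)}

# Query: the helix after a rigid motion (90-degree rotation about x, then a shift).
query = [(x + 10.0, -z + 5.0, y - 3.0) for x, y, z in ideal_helix(7)]

name, score = best_fragment(query, library)
print(name, round(score, 3))
```

Because only internal distances are compared, the score is unchanged by rotation and translation of the query, which is why the moved helix still matches its library entry. The same property means the score cannot tell a fragment from its mirror image, so a practical system would need a handedness check or a full superposition as a second step.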