The goal of this project is to develop a computer algorithm that accurately predicts the tertiary fold of a protein given only its amino acid sequence together with either sparse experimental constraints or an imperfect secondary structure prediction. The algorithm will use an evolutionary optimization algorithm, a detailed atomic representation, and a proprietary method for deriving a protein energy function.
The specific aims of Phase II are: (1) improve the representation of the problem; (2) improve the optimization algorithm; (3) parameterize a more accurate protein energy function; and (4) validate the prototype algorithm and energy function on a large sample of known protein structures and in blind structure prediction experiments. In Phase III the algorithm will be developed into commercial software products for the pharmaceutical and biotechnology industries. The long-term goal is to help satisfy the rapidly growing demand for the accurate prediction of protein structures from sequence information. This demand arises not only from the exponential growth of the database of sequences, but also from the need to understand the structure and function of newly discovered gene products known to be involved in human disease.
Commercial applications include protein structure determination, structure-based drug design, and protein engineering in the pharmaceutical and biotechnology industries and in basic academic research.