The Division of Materials Research and the Division of Mathematical Sciences contribute funding to this award, which falls under the NSF-wide Mathematical Sciences Priority Area and contributes to cyberinfrastructure. This award supports theoretical and computational research in the area of protein folding and the solution of inverse problems. The PI plans to apply a solution search strategy that has proven successful in phase retrieval to the problem of protein folding. The key elements of this method are constraint projections in a Euclidean space: maps that restore a particular constraint to an arbitrary input point while moving that point as little as possible. Many problems can be formulated so that their solutions are the points in the intersection of just two constraint sets. For such problems a dynamical system can be defined in terms of the corresponding constraint projections, with the solution to the set intersection problem encoded in its fixed points. Constraint-based algorithms are the method of choice in phase retrieval and may offer significant advantages over mainstream sampling algorithms in protein structure prediction. As in phase retrieval, where a significant computational advantage is conferred by overdetermined constraint sets, a similar gain is expected when folding sequences that are well designed. Experiments with simple heteropolymer models of proteins, where the two constraints correspond to chain geometry and monomer packing, show promise that this approach can be extended to realistic models. This project will also develop a novel form of distributed computing made possible by the chaotic dynamics of the constraint-based search.
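The idea of a projection-based dynamical system whose fixed points encode a point in the intersection of two constraint sets can be illustrated with a toy problem. The sketch below is not the project's folding code. It assumes one well-known iteration of this type, x -> x + P_A(2 P_B(x) - x) - P_B(x) (a special case of the difference map), and it replaces the chain-geometry and monomer-packing constraints with two hypothetical stand-ins in the plane: the unit circle and the line y = 0.5. All function names and parameters are illustrative.

```python
import math


def project_circle(p):
    """Nearest point on the unit circle (a nonconvex constraint set)."""
    x, y = p
    r = math.hypot(x, y)
    if r == 0.0:
        return (1.0, 0.0)  # all circle points are equally near; pick one
    return (x / r, y / r)


def project_line(p, height=0.5):
    """Nearest point on the horizontal line y = height."""
    return (p[0], height)


def iterate(p, proj_a=project_line, proj_b=project_circle):
    """One step of x -> x + P_A(2 P_B(x) - x) - P_B(x)."""
    pb = proj_b(p)
    reflected = (2.0 * pb[0] - p[0], 2.0 * pb[1] - p[1])
    pa = proj_a(reflected)
    return (p[0] + pa[0] - pb[0], p[1] + pa[1] - pb[1])


def solve(start, tol=1e-12, max_iter=1000):
    """Iterate until the update vanishes, i.e. a fixed point is reached.

    The update equals P_A(...) - P_B(x), so a vanishing update means the
    point P_B(x) satisfies both constraints. Returns (solution, steps).
    """
    p = start
    for n in range(1, max_iter + 1):
        q = iterate(p)
        step = math.hypot(q[0] - p[0], q[1] - p[1])
        p = q
        if step < tol:
            return project_circle(p), n
    raise RuntimeError("no fixed point found within max_iter steps")


if __name__ == "__main__":
    solution, steps = solve((2.0, 1.0))
    print(solution, steps)
```

Note that the solution is read off by projecting the fixed point onto one of the constraint sets, not from the iterate itself. In this two-dimensional toy the iteration settles quickly; in hard, high-dimensional nonconvex problems the same kind of map can wander for a long time before reaching a fixed point.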
The next generation of scientists and engineers will increasingly rely on shared databases and standardized computing protocols in the conduct of their work. A significant component of this project is the development of a miniature realization of such a work environment called "semiprotein world". Semiproteins are model proteins with highly simplified properties that nevertheless pose many of the same challenges as real proteins. Through a collection of software tools, including a web-based semiprotein data bank, semiprofessional researchers with web access will be able to design and fold semiproteins, and then deposit their findings in the database. The design of semiprotein world will involve on-site participation of Ithaca-area high school students and undergraduates in the Cornell Center for Materials Research NSF-REU program.
NON-TECHNICAL SUMMARY:
The Division of Materials Research and the Division of Mathematical Sciences contribute funding to this award, which falls under the NSF-wide Mathematical Sciences Priority Area and contributes to cyberinfrastructure. This award supports theoretical and computational research in the area of protein folding. Proteins are major constituents of biological cells and play important roles in the structure and function of living organisms. To carry out its biological function, a protein undergoes a kind of self-assembly to assume a particular shape, or fold. What shape a folded protein takes, how it folds, and how it folds so quickly are fundamental questions in understanding its function. Existing computer simulation methods are computationally intensive. Here the PI will exploit fundamental connections between the problem of protein folding and microscopies that seek to form an image from imperfect data, e.g. x-ray diffraction, to further develop a powerful new algorithm for protein folding that overcomes the barrier of long simulation times. Preliminary results suggest that the algorithm will be far more computationally efficient than current methods.
The next generation of scientists and engineers will increasingly rely on shared databases and standardized computing protocols in the conduct of their work. A significant component of this project is the development of a miniature realization of such a work environment called "semiprotein world". Semiproteins are model proteins with highly simplified properties that nevertheless pose many of the same challenges as real proteins. Through a collection of software tools, including a web-based semiprotein data bank, semiprofessional researchers with web access will be able to design and fold semiproteins, and then deposit their findings in the database. The design of semiprotein world will involve on-site participation of Ithaca-area high school students and undergraduates in the Cornell Center for Materials Research NSF-REU program.