The success of genome sequencing over the last four decades has resulted in a rapidly widening gap between the number of known protein sequences and the number of known protein structures and functions. Because protein sequence alone cannot tell us what each molecule does in cells, the large-scale absence of protein structure and function information severely hinders the progress of contemporary biological and medical studies. These gaps in understanding call for efficient computational approaches to automated, yet highly accurate, protein structure prediction and function annotation. The PI's lab has a successful track record in developing and disseminating high-quality structural bioinformatics methods that have been widely used by the global community. In this project, the lab seeks to develop new advanced methods for both tertiary and quaternary protein structure prediction. Building on the tools and databases previously developed in the PI's lab, new deep neural-network-based techniques will be extended to residue-level intra- and inter-chain contact- and distance-map prediction. These predictions will then be used to constrain the conformational search space of threading-based fragment assembly simulations, with the aim of significantly improving the accuracy and success rate of structure modeling for monomeric proteins and protein-protein interactions (PPIs), especially for difficult targets that lack homologous templates in the Protein Data Bank. Next, the structure and PPI network information will be used to help elucidate multiple levels of biological and biomedical function for protein molecules, including prediction of mutation-induced changes in protein stability and of human disease.
The long-term goals of this project are to significantly improve the state of the art of protein structure prediction and to narrow the gap between the abundance of protein sequence information and the dearth of protein structure and function data, thus significantly enhancing the usefulness and impact of structural bioinformatics. Success in this project will also help reveal the general principles governing the fundamental relationships among the sequence, structure, and function of protein molecules.
Researchers in the contemporary drug industry rely on knowledge of the three-dimensional structures of protein molecules to design synthetic compounds that fight human diseases. However, many pharmaceutically important proteins do not have experimentally solved structures. This project seeks to develop advanced computational methods for high-quality protein structure prediction that can be used for function annotation and compound screening; these methods should have an important and general impact on drug discovery and human health.