A fundamental problem in molecular biology is the structure-function relationship of proteins. To understand how structure dictates the function of a protein, it is essential to (1) identify functionally important surfaces on proteins. At the genomic and proteomic scale, it is also critical to (2) identify significant similarities in surface patterns among proteins that may have different fold structures. The inverse problem of the structure-function relationship asks: (3) How does protein function influence the folding and stability of proteins? A related general question is: (4) Do geometric properties such as packing defects influence the stability and function of proteins, e.g., for proteins from thermophilic microbes that thrive at high temperatures?

This project develops novel statistical models and computational methods that help to solve these four important biological problems. Sequential Monte Carlo (SMC) methodologies that have recently emerged in statistics show great promise. This project develops Constrained Sequential Monte Carlo (CSMC) methods specifically designed to solve these high-dimensional, complex statistical inference problems with severe constraints. General strategies and theory for designing the key components are developed for successful CSMC implementation. The implemented CSMC tools are disseminated freely to the research community.

The results of this project enable the discovery of spatial surface motifs and uncover novel functional relations among proteins that are important for drug discovery. Newly discovered patterns can be employed to search for functionally related protein sequences when structural information is not available. In addition, this research provides important tools for quantitatively assessing how protein function influences protein folding and stability. Insights are gained into how packing defects influence protein stability.