This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.
A major unsolved problem for structure-function linkage using computational prediction is that, while we can cluster protein sequences and structures with good statistical significance based on many types of similarity metrics, it is not clear how those clusters link to functional classes. Simple approaches such as ortholog prediction can achieve good results for sequences that are closely similar or that contain readily identifiable motifs that distinguish functional classes, but for many protein superfamilies successful prediction is far from trivial. This is the case for the functionally diverse superfamilies in the SFLD. These are homologous sets of enzymes that carry out different chemical transformations, using different substrates, but all share a specific chemical functionality or partial reaction. The main purpose of the SFLD is to aid researchers in the curation of these types of superfamilies, to help in the identification of new members of these superfamilies, and to provide an explicit structure-function mapping for these enzymes. Because the different functional families in a given superfamily look similar but perform different specific reactions, they are difficult to annotate and easy to misannotate, with levels of misannotation as high as 80% in the archival databases GenBank NR and TrEMBL. Because sequence information is still becoming available in large volumes, automated methods are required to update the SFLD superfamilies with newly determined sequences and assign them to the appropriate functional families.
Clearly, improved methods for achieving these functional assignments are urgently needed. Developing such an approach has been a major focus of the RBVI in collaboration with the group of Prof. Jacquelyn Fetrow of Wake Forest University. The active site profiling methods developed by Dr. Fetrow have now been integrated with an approach developed in the Babbitt lab, Genetic Algorithm Search for Patterns in Structures (GASPS), to automatically determine 3D templates capable of distinguishing new superfamily members, so that sequences can be assigned automatically to the specific functional families to which they belong. GASPS will be combined with Fetrow's methods to create sequence and structural motifs for automated clustering of SFLD data. The core elements of the method include a motif-generating technology called "Fuzzy Functional Forms" (FFF), implemented by the tool Protein Active Site Structure Search (PASSS), and the Deacon Active Site Profiler (DASP), which uses three-dimensional, or structure-based, active-site profiling to identify residues located in the spatial environment around the active site. PASSS uses the FFF technology, describing a protein's functional site by the distances between the alpha carbons of three key residues important to the functional site chemistry and the alpha carbons of adjacent residues. Based on the premise that functionally related proteins should have structural similarity at the functional site, PASSS returns proteins related to the starting known functional site. DASP expands on this, extracting the residues that are found in the vicinity of the key residues for each protein, creating motifs from these fragments, and using these motifs to search all sequences in a database to return proteins that may share this function. Used together and iteratively, these tools provide a quick way to assign putative functions to both structures and sequences.
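The two ideas described above, a distance-based description of the functional site and the collection of residues near the key residues, can be illustrated with a minimal sketch. This is not the PASSS or DASP implementation: the function names, the toy coordinates, the 1.0 Å tolerance, and the 10.0 Å radius are all assumptions chosen for illustration, and the sketch uses only the three key alpha-carbon positions rather than the adjacent residues as well.

```python
import math

def distance_signature(key_ca):
    """Pairwise CA-CA distances (d12, d13, d23) for three key residues.

    key_ca is a sequence of three (x, y, z) alpha-carbon coordinates.
    """
    a, b, c = key_ca
    return (math.dist(a, b), math.dist(a, c), math.dist(b, c))

def fuzzy_match(template, candidate, tolerance=1.0):
    """True if every candidate distance is within `tolerance` of the template.

    The tolerance (hypothetical default, in angstroms) is what makes the
    comparison "fuzzy" rather than exact.
    """
    return all(abs(t - c) <= tolerance for t, c in zip(template, candidate))

def neighborhood(residues, key_ids, radius=10.0):
    """Residue ids whose CA lies within `radius` of any key residue CA.

    residues maps a residue id to (residue_name, (x, y, z)).
    The radius is an illustrative assumption, not a published parameter.
    """
    key_coords = [residues[k][1] for k in key_ids]
    return sorted(
        rid
        for rid, (_, xyz) in residues.items()
        if any(math.dist(xyz, k) <= radius for k in key_coords)
    )

if __name__ == "__main__":
    # Toy coordinates, not taken from any real structure.
    template = distance_signature([(0, 0, 0), (3, 0, 0), (0, 4, 0)])
    similar = distance_signature([(1, 1, 1), (4.2, 1, 1), (1, 5.1, 1)])
    different = distance_signature([(0, 0, 0), (8, 0, 0), (0, 4, 0)])
    print(fuzzy_match(template, similar))
    print(fuzzy_match(template, different))

    toy = {
        10: ("LYS", (0, 0, 0)),
        11: ("ALA", (3, 0, 0)),
        50: ("ASP", (0, 4, 0)),
        90: ("GLY", (30, 30, 30)),
    }
    print(neighborhood(toy, key_ids=[10, 50], radius=5.0))
```

In this sketch a candidate site matches when all three inter-residue distances fall within the tolerance of the template, and the residues returned by the neighborhood step are the ones from which sequence fragments would then be taken to build a profile.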
Preliminary results from this project show exceptional accuracy in distinguishing functionally diverse families in the enolase and kinase superfamilies. The former is one of the annotated superfamilies in the SFLD and serves as a challenging test system for this type of automated effort.

National Institutes of Health (NIH)
National Center for Research Resources (NCRR)
Biotechnology Resource Grants (P41)
Study Section
Special Emphasis Panel (ZRG1-BST-D (40))
University of California San Francisco
Schools of Pharmacy
San Francisco
United States