This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. Primary support for the subproject and the subproject's principal investigator may have been provided by other sources, including other NIH sources. The Total Cost listed for the subproject likely represents the estimated amount of Center infrastructure utilized by the subproject, not direct funding provided by the NCRR grant to the subproject or subproject staff.

A major unsolved problem in computational prediction of structure-function linkage is that, although protein sequences and structures can be clustered accurately and with good statistical significance using many types of similarity metrics, it is not clear how those clusters map to functional classes. Simple approaches such as ortholog prediction can achieve good results for sequences that are closely similar or that contain readily identifiable motifs distinguishing functional classes, but for many protein superfamilies successful prediction is far from trivial. This is the case for the functionally diverse superfamilies in the SFLD. These are homologous sets of enzymes that carry out different chemical transformations on different substrates, yet all share a specific chemical functionality or partial reaction. The main purpose of the SFLD is to aid researchers in curating these types of superfamilies, to help identify new members of these superfamilies, and to provide an explicit structure-function mapping for these enzymes. Because the different functional families in a given superfamily look similar but perform different specific reactions, they are difficult to annotate and easy to misannotate, with misannotation levels as high as 80% in the archival databases GenBank NR and TrEMBL.
Because sequence information is still becoming available in large volumes, automated methods are required to update the SFLD superfamilies with newly determined sequences and to assign them to the appropriate functional families. Improved methods for making these functional assignments are urgently needed. Developing such an approach has been a major focus of the RBVI, in collaboration with the group of Prof. Jacquelyn Fetrow of Wake Forest University. The active site profiling methods developed by Dr. Fetrow have now been integrated with an approach developed in the Babbitt lab, Genetic Algorithm Search for Patterns in Structures (GASPS), to automatically determine 3D templates capable of distinguishing new superfamily members, so that sequences can be assigned automatically to the specific functional families to which they belong. GASPS will be combined with Fetrow's methods to create sequence and structural motifs for automated clustering of SFLD data. The core elements of the method are a motif-generating technology called "Fuzzy Functional Forms" (FFF), implemented by the tool Protein Active Site Structure Search (PASSS), and the Deacon Active Site Profiler (DASP), which uses three-dimensional, or structure-based, active-site profiling to identify residues located in the spatial environment around the active site. PASSS uses the FFF technology, describing a protein's functional site by the distances between the alpha carbons of three key residues important to the functional site chemistry and the alpha carbons of adjacent residues. Based on the premise that functionally related proteins should have structural similarity at the functional site, PASSS returns proteins related to the starting known functional site.
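The distance-based description used by PASSS can be illustrated with a minimal sketch. It is not the PASSS or FFF implementation: the function names, the restriction to the three key residues only (adjacent residues are omitted), and the 1.0 angstrom tolerance are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def ca_distance_signature(ca_coords):
    """Return the three pairwise alpha-carbon distances for three key residues.

    ca_coords: three (x, y, z) tuples in angstroms, given in a fixed residue
    order. Distances are returned for the pairs (0, 1), (0, 2), (1, 2).
    """
    if len(ca_coords) != 3:
        raise ValueError("expected exactly three alpha-carbon coordinates")
    return tuple(dist(a, b) for a, b in combinations(ca_coords, 2))


def matches_template(signature, template, tolerance=1.0):
    """Check whether every distance lies within `tolerance` of the template.

    The tolerance is a hypothetical stand-in for the "fuzzy" part of a
    distance-based motif; the real value used by any tool is not given here.
    """
    return all(abs(s - t) <= tolerance for s, t in zip(signature, template))


# Hypothetical coordinates for three key residues (a 3-4-5 triangle).
site = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
signature = ca_distance_signature(site)
```

A candidate structure would be scored by computing the same signature for its putative key residues and comparing it with a template derived from a known functional site.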
DASP expands on this by extracting the residues found in the vicinity of the key residues for each protein, creating motifs from these fragments, and using the fragments to search all sequences in a database, returning proteins that may share this function. Used together and iteratively, these tools provide a quick method for putative functional characterization of both structures and sequences. Preliminary results from this project show exceptional accuracy in distinguishing functionally diverse families in the enolase and kinase superfamilies. The former is one of the annotated superfamilies in the SFLD and serves as a challenging test system for this type of automated effort. The pipeline is now being applied to the kinase superfamily in an effort to add this superfamily to the SFLD using an automated approach.
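The fragment-extraction step described for DASP can be sketched in the same spirit. This is an illustrative approximation, not DASP itself: the function name, the input layout, the use of alpha-carbon coordinates only, and the 10 angstrom default radius are assumptions.

```python
from math import dist


def nearby_fragments(residues, key_indices, radius=10.0):
    """Collect sequence fragments lying near the key residues.

    residues: list of (one_letter_code, (x, y, z)) in sequence order, where
        the coordinate is the alpha carbon position in angstroms.
    key_indices: positions in `residues` of the key functional-site residues.
    radius: hypothetical distance cutoff defining "in the vicinity".

    Residues within `radius` of any key residue are selected, and runs of
    consecutive selected positions are joined into fragment strings.
    """
    key_coords = [residues[i][1] for i in key_indices]
    selected = [
        i for i, (_, xyz) in enumerate(residues)
        if any(dist(xyz, key) <= radius for key in key_coords)
    ]

    fragments, current, previous = [], [], None
    for i in selected:
        # Start a new fragment whenever the selected positions are not adjacent.
        if previous is not None and i != previous + 1:
            fragments.append("".join(current))
            current = []
        current.append(residues[i][0])
        previous = i
    if current:
        fragments.append("".join(current))
    return fragments


# Hypothetical ten-residue chain laid out along the x axis, 3.8 angstroms apart.
chain = [(aa, (3.8 * i, 0.0, 0.0)) for i, aa in enumerate("ACDEFGHIKL")]
profile = nearby_fragments(chain, key_indices=[1, 8], radius=4.0)
```

The resulting fragments could then be turned into a motif and used to search a sequence database; that search step is outside the scope of this sketch.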

National Institutes of Health (NIH)
National Center for Research Resources (NCRR)
Biotechnology Resource Grants (P41)
Study Section
Special Emphasis Panel (ZRG1-BST-D (40))
University of California San Francisco
Schools of Pharmacy
San Francisco
United States