This work aims to identify protein functional determinants and to compare them across the proteome to predict protein function. The approach is predicated on a phylogenomic algorithm, the Evolutionary Trace (ET), that identifies key functional residues in proteins; and on ET Annotation (ETA) algorithms, which extract from ET analysis 3D templates describing the composition and conformation of key residues involved in binding or catalysis, and then search other structures for geometric matches to these 3D templates that suggest a common function. Preliminary data have extensively validated ET, both computationally and experimentally, and ETA has become a useful tool for annotating the function of structural genomics proteins. Both methods, however, can still gain in sensitivity, specificity and scalability. To do so, we propose in Aim 1, first, to improve the ET identification of key functional residues by optimizing the selection of the input sequences and by introducing a new measure of residue functional importance, and, second, to refine the selection of 3D templates.
In Aim 2, we propose a new network-based annotation diffusion method to compare all 3D template matches at once and to add in functional information from other sources, such as from proteins without known structure.
Aim 3 is experimental: it will test our predictions through mutations and assays on proteins of direct medical interest, including one that controls drug resistance in bacteria and another that is a marker of drug resistance in malaria. In the long term, these results should help to focus protein engineering and drug design on the most functionally and therapeutically relevant parts of a protein and, most broadly, to link the massive and exponentially growing amounts of raw sequence and structure data to biological function and its molecular basis.
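
The abstract does not specify how the annotation diffusion of Aim 2 is computed. As a purely illustrative sketch, and not the project's actual method, one generic way to diffuse annotations over a network is an iterative label-propagation scheme: proteins are nodes, edge weights stand in for 3D template match strength, and proteins of known function act as seeds. The function name, the parameter `alpha`, and the toy protein names below are all assumptions made for this example.

```python
def diffuse_annotations(edges, seeds, alpha=0.5, iterations=50):
    """Generic label propagation on a weighted, undirected graph.

    edges: dict mapping (node_a, node_b) -> weight; here a hypothetical
           stand-in for the strength of a 3D template match.
    seeds: dict mapping node -> initial score (e.g. 1.0 for a protein
           already annotated with the function of interest).
    alpha: share of a node's score drawn from its neighbours each step;
           the remainder is pinned to the node's own seed value.
    """
    # Build a symmetric adjacency map from the edge list.
    neighbours = {}
    for (a, b), weight in edges.items():
        neighbours.setdefault(a, {})[b] = weight
        neighbours.setdefault(b, {})[a] = weight
    for node in seeds:
        neighbours.setdefault(node, {})

    scores = {node: seeds.get(node, 0.0) for node in neighbours}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbours.items():
            total = sum(nbrs.values())
            # Weighted average of the neighbours' current scores.
            spread = (
                sum(w * scores[m] for m, w in nbrs.items()) / total
                if total else 0.0
            )
            updated[node] = (1 - alpha) * seeds.get(node, 0.0) + alpha * spread
        scores = updated
    return scores


# Toy chain: protA (annotated) -- protB -- protC (both unannotated).
example_edges = {("protA", "protB"): 1.0, ("protB", "protC"): 1.0}
example_seeds = {"protA": 1.0}
example_scores = diffuse_annotations(example_edges, example_seeds)
```

In this toy chain the score decays with network distance from the annotated protein, so a protein linked to it only indirectly still receives a weaker share of the annotation. A real implementation would need to handle many functions at once and combine evidence from sources other than structure, as the aim describes.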

Public Health Relevance

Modern biology excels at producing volumes of basic information on the composition of our genes and on the structure of the proteins that they encode. However, much of this potentially useful information lies fallow and does not contribute to our understanding of the basic biology of disease or to the development of new drugs and treatments. The reason is that it remains difficult to know what these new genes actually do, and how they do it. This work develops computational methods to answer both questions. In so doing, it should help identify the function of novel proteins and help connect them to pathological processes. For example, to test some of our tools and predictions, we will experimentally study two proteins of medical interest, one that orchestrates drug resistance in bacteria, and another that marks drug resistance in malaria.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM079656-07
Application #: 8537933
Study Section: Special Emphasis Panel (ZRG1-BCMB-B (03))
Program Officer: Wehrle, Janna P
Project Start: 2007-04-01
Project End: 2015-08-31
Budget Start: 2013-09-01
Budget End: 2014-08-31
Support Year: 7
Fiscal Year: 2013
Total Cost: $377,204
Indirect Cost: $136,179

Name: Baylor College of Medicine
Department: Genetics
Type: Schools of Medicine
DUNS #: 051113330
City: Houston
State: TX
Country: United States
Zip Code: 77030

Wilson, Stephen J; Wilkins, Angela D; Lin, Chih-Hsu et al. (2016) Discovery of functional and disease pathways by community detection in protein-protein interaction networks. Pac Symp Biocomput 22:336-347
Sung, Yun-Min; Wilkins, Angela D; Rodriguez, Gustavo J et al. (2016) Intramolecular allosteric communication in dopamine D2 receptor revealed by evolutionary amino acid covariation. Proc Natl Acad Sci U S A 113:3539-44
Marciano, David C; Lua, Rhonald C; Herman, Christophe et al. (2016) Cooperativity of Negative Autoregulation Confers Increased Mutational Robustness. Phys Rev Lett 116:258104
Koire, Amanda; Katsonis, Panagiotis; Lichtarge, Olivier (2016) Repurposing germline exomes of The Cancer Genome Atlas demands a cautious approach and sample-specific variant filtering. Pac Symp Biocomput 21:207-18
Regenbogen, Sam; Wilkins, Angela D; Lichtarge, Olivier (2016) Computing therapy for precision medicine: collaborative filtering integrates and predicts multi-entity interactions. Pac Symp Biocomput 21:21-32
Gallion, Jonathan; Wilkins, Angela D; Lichtarge, Olivier (2016) Human kinases display mutational hotspots at cognate positions within cancer. Pac Symp Biocomput 22:414-425
Lua, Rhonald C; Wilson, Stephen J; Konecki, Daniel M et al. (2016) UET: a database of evolutionarily-predicted functional determinants of protein sequences that cluster as functional sites in protein structures. Nucleic Acids Res 44:D308-12
Mullany, Lisa K; Wong, Kwong-Kwok; Marciano, David C et al. (2015) Specific TP53 Mutants Overrepresented in Ovarian Cancer Impact CNV, TP53 Activity, Responses to Nutlin-3a, and Cell Survival. Neoplasia 17:789-803
Peterson, Sean M; Pack, Thomas F; Wilkins, Angela D et al. (2015) Elucidation of G-protein and β-arrestin functional selectivity at the dopamine D2 receptor. Proc Natl Acad Sci U S A 112:7097-102
Neskey, David M; Osman, Abdullah A; Ow, Thomas J et al. (2015) Evolutionary Action Score of TP53 Identifies High-Risk Mutations Associated with Decreased Survival and Increased Distant Metastases in Head and Neck Cancer. Cancer Res 75:1527-36

Showing the most recent 10 out of 52 publications