Modeling evolution of functional context in proteins

Pollock, David

Abstract

The structure, function, and interactions of proteins produce evolutionary patterns that are imprinted on protein sequences. Here, mitochondrial and nuclear sequences will be used to study evolutionary processes and develop an understanding of how proteins evolve in the context of structural, energetic, and functional constraints. Improved models of protein evolution will be developed and informed by this deeper understanding, and their utility in predicting mutational effects and structural features will be evaluated. They will also be used to better predict adaptive bursts and levels of convergence and coevolution among residues, particularly in multigene families. This research is motivated by insights from previous work. First, it is expected from evolutionary simulations that substitution probabilities at individual positions in a protein fluctuate over time due to epistasis (interactions with substitutions at other sites in the same or other proteins). These expectations are supported by strong evidence that substitution processes do regularly fluctuate with time in real proteins. However, current models of protein evolution do not usually allow substitution processes to fluctuate with time, and levels of amino acid convergence in proteins deviate substantially from expectations for such models. Because of this, incorporating such fluctuations is a key feature of the proposed models. Second, current approaches that incorporate structure into evolutionary studies tend to use de novo prediction or pseudo-energy potentials to predict the acceptability of substitutions, but these methods are not especially accurate for evolutionary analysis, which includes sequences that have diverged substantially from the sequences of known protein structures.
To account for this, rather than allowing such predictions to stand alone, they will be incorporated probabilistically into empirical substitution models to varying degrees depending on expected predictive accuracy and distance from any sequences with known structure. Third, a Bayesian approach to building complex evolutionary models was recently developed that is designed to allow relatively easy computation of processes that fluctuate among sites and over time. This approach, which uses what is called partial sampling of substitution histories, makes the proposed methodology feasible. It is expected that the proposed study will significantly improve the understanding of molecular evolution and how it relates to structure and function. One expected result of this study will be better predictions of mutational effects, which will lead to an improved ability to identify disease-causing mutations in human genome and exome sequencing studies. It is further expected that predictions of structural features will be improved when those features are unknown, and that researchers will be able to better understand how ancestral functional changes in proteins have arisen through adaptive sequence change.

Public Health Relevance

The proposed research is relevant to public health because it will develop new and more accurate methods for extracting information from comparative genomic data. This information will shed light on protein structure and function and on how they relate to the phenotypes of disease-related mutations in humans. Such predictions will be useful in genetic studies of human disease, and also in studies that attempt to modify protein function through drugs to ameliorate disease. More generally, the research will improve our basic understanding of how and why proteins work the way they do, improving our ability to make intelligent decisions in protein-related health research.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM083127-08
Application #: 9637392
Study Section: Genetic Variation and Evolution Study Section (GVE)
Program Officer: Janes, Daniel E

Project Start: 2009-03-01
Project End: 2021-02-28
Budget Start: 2019-03-01
Budget End: 2021-02-28
Support Year: 8
Fiscal Year: 2019

Institution

Name: University of Colorado Denver
Department: Biochemistry
Type: Schools of Medicine
DUNS #: 041096314

City: Aurora
State: CO
Country: United States
Zip Code: 80045

Related projects


NIH 2019 R01 GM	Modeling evolution of functional context in proteins Pollock, David D. / University of Colorado Denver
NIH 2018 R01 GM	Modeling evolution of functional context in proteins Pollock, David D. / University of Colorado Denver
NIH 2017 R01 GM	Modeling evolution of functional context in proteins Pollock, David D. / University of Colorado Denver	$321,926
NIH 2016 R01 GM	Modeling evolution of functional context in proteins Pollock, David D. / University of Colorado Denver
NIH 2012 R01 GM	Modeling evolution of functional context in proteins Pollock, David D. / University of Colorado Denver	$321,190
NIH 2011 R01 GM	Modeling evolution of functional context in proteins Pollock, David D. / University of Colorado Denver	$345,462
NIH 2010 R01 GM	Modeling evolution of functional context in proteins Pollock, David D. / University of Colorado Denver	$301,692
NIH 2010 R01 GM	Modeling evolution of functional context in proteins Pollock, David D. / University of Colorado Denver	$23,196
NIH 2009 R01 GM	Modeling evolution of functional context in proteins Pollock, David D. / University of Colorado Denver	$289,042

Publications

Goldstein, Richard A; Pollock, David D (2017) Sequence entropy of folding and the absolute rate of amino acid substitutions. Nat Ecol Evol 1:1923-1930

Goldstein, Richard A; Pollock, David D (2016) The tangled bank of amino acids. Protein Sci 25:1354-62

Goldstein, Richard A; Pollard, Stephen T; Shah, Seena D et al. (2015) Nonadaptive Amino Acid Convergence Rates Decrease over Time. Mol Biol Evol 32:1373-81

Li, Cai; Zhang, Yong; Li, Jianwen et al. (2014) Two Antarctic penguin genomes reveal insights into their evolutionary history and molecular changes related to the Antarctic environment. Gigascience 3:27

Wacholder, Aaron C; Cox, Corey; Meyer, Thomas J et al. (2014) Inference of transposable element ancestry. PLoS Genet 10:e1004482

Castoe, Todd A; de Koning, A P Jason; Hall, Kathryn T et al. (2013) The Burmese python genome reveals the molecular basis for extreme adaptation in snakes. Proc Natl Acad Sci U S A 110:20645-50

Nakayama, Maki; Castoe, Todd; Sosinowski, Tomasz et al. (2012) Germline TRAV5D-4 T-cell receptor sequence targets a primary insulin peptide of NOD mice. Diabetes 61:857-65

de Koning, A P Jason; Gu, Wanjun; Castoe, Todd A et al. (2012) Phylogenetics, likelihood, evolution and complexity. Bioinformatics 28:2989-90

Pollock, David D; Thiltgen, Grant; Goldstein, Richard A (2012) Amino acid coevolution induces an evolutionary Stokes shift. Proc Natl Acad Sci U S A 109:E1352-9

Yokoyama, Ken Daigoro; Pollock, David D (2012) SP transcription factor paralogs and DNA-binding sites coevolve and adaptively converge in mammals and birds. Genome Biol Evol 4:1102-17

Showing the most recent 10 out of 19 publications
