Natural selection can be challenging to study. Fitness differences that are too small to measure directly can have profound evolutionary consequences. This has motivated the creation of statistical tools for characterizing natural selection from data sets of naturally occurring genetic variation. Selection operates on phenotype, but these statistical tools tend to ignore this. Instead, they often treat the fitness associated with each allele or genotype as a free parameter. Our research employs computational methods for predicting phenotype from DNA sequence data. This enables our statistical procedures to extract more information about selection from data sets, and it facilitates studies of how phenotype affects the evolution of genotype. Another unconventional feature of our research is that we analyze interspecific data but frame estimates with respect to population genetics. We do this because most of evolutionary history can be studied only through interspecific comparisons and because population genetics is the natural framework within which to study selection. Our research focuses on natural selection to maintain protein structure, but our inference strategies can assess the evolutionary impact of other phenotypes. To better understand the influence of tertiary structure, we will simultaneously examine the evolutionary roles of context-dependent mutation, codon usage, and mRNA abundance. The main consequence of our more realistic evolutionary models will be better population genetic inferences about natural selection from interspecific data, but the models also have the potential to assist with applications ranging from ancestral sequence reconstruction to the inference of adaptive landscapes. Simulation will help to evaluate the quality of our population genetic inferences from interspecific data and will let us determine how to improve these inferences.
We will devote particular attention to the situation where populations have concurrent fitness-affecting polymorphisms that interfere with each other via the Hill-Robertson effect. Because our interspecific models are framed with respect to population genetics, we can combine interspecific and intraspecific data in a principled way. This explicit evolutionary perspective will improve an already successful approach for predicting which nonsynonymous variation affects human health.
This project will lead to an improved understanding of the role that natural selection plays in shaping genetic variation. Building on this understanding, we will develop statistical techniques for identifying which variation in protein-coding genes is likely to be deleterious to human health.
Showing the most recent 10 out of 31 publications:
Wang, Kuangyu; Yu, Shuhui; Ji, Xiang et al. (2015) Roles of solvent accessibility and gene expression in modeling protein sequence evolution. Evol Bioinform Online 11:85-96
Lee, Hui-Jie; Rodrigue, Nicolas; Thorne, Jeffrey L (2015) Relaxing the Molecular Clock to Different Degrees for Different Substitution Types. Mol Biol Evol 32:1948-61
Lassiter, Erica S; Russ, Carsten; Nusbaum, Chad et al. (2015) Mitochondrial genome sequences reveal evolutionary relationships of the Phytophthora 1c clade species. Curr Genet 61:567-77
Griffing, Alexander R; Lynch, Benjamin R; Stone, Eric A (2014) Structural properties of the minimum cut of partially-supplied graphs. Discrete Appl Math 117:152-157
Vensko 2nd, Steven P; Stone, Eric A (2014) No evidence for a global male-specific lethal complex-mediated dosage compensation contribution to the demasculinization of the Drosophila melanogaster X chromosome. PLoS One 9:e103659
Griffing, Alexander R; Lynch, Benjamin R; Stone, Eric A (2013) An eigenvector interlacing property of graphs that arise from trees by Schur complementation of the Laplacian. Linear Algebra Appl 438:1078-1094
Nasrallah, Chris A (2013) The dynamics of alternative pathways to compensatory substitution. BMC Bioinformatics 14 Suppl 15:S2
Liberles, David A; Teichmann, Sarah A; Bahar, Ivet et al. (2012) The interface of protein structure, protein biophysics, and molecular evolution. Protein Sci 21:769-85
Zou, Liwen; Susko, Edward; Field, Chris et al. (2012) Fitting nonstationary general-time-reversible models to obtain edge-lengths and frequencies for the Barry-Hartigan model. Syst Biol 61:927-40
Cartwright, Reed A; Hussin, Julie; Keebler, Jonathan E M et al. (2012) A family-based probabilistic method for capturing de novo mutations from high-throughput short-read sequencing data. Stat Appl Genet Mol Biol 11: