Knowledge of evolutionary rates and dates is essential for answering fundamental questions in biology and medicine, including the antiquity of gene duplications, the origins of pathogenic strains, and the relative tempo of changes in genes and convergence in species. Despite decades of methodological advances, researchers face substantial challenges when conducting these analyses. We therefore focus on developing a new relative rate framework (RRF) that will advance beyond the current state of the art to address conceptual and practical challenges in estimating evolutionary rates and dates. Using the RRF, we will develop much-needed methods to test hypotheses of evolutionary rate independence among lineages and to select the statistical distribution that best fits the given data. No reliable methods currently exist for either purpose, which compels practitioners to make arbitrary, ad hoc choices that bias inferred temporal trends and leave tests of evolutionary hypotheses underpowered. We will use our newly developed methods to query empirical datasets regarding fundamental questions of rates and dates, including the hypothesized existence and prevalence of evolutionary rate correlation in closely and distantly related species. The statistical development of the RRF will produce reliable estimates of node dates to establish robust biological patterns, and generate robust 95% confidence intervals to test hypotheses. The RRF will be computationally efficient and scalable, with accuracy surpassing that of computationally intensive methods, which currently require ad hoc divide-and-conquer or data subsampling approaches when applied to larger datasets. We will also create a library of functions containing the advanced methods developed in this project, which will be directly usable on the command line and available in a graphical interface through integration with the MEGA software.
Molecular evolutionary rates and dates of evolutionary divergence events are central features of comparative studies in molecular biology. We will develop advanced methods for inference of these parameters from large genomic datasets, and conduct analyses of available empirical data to test major biological hypotheses and generate new insights. The proposed methods and their software implementation will greatly facilitate research pursued in biology and biomedicine.
Battistuzzi, Fabia U; Tao, Qiqing; Jones, Lance et al. (2018) RelTime Relaxes the Strict Molecular Clock throughout the Phylogeny. Genome Biol Evol 10:1631-1636
Tamura, Koichiro; Tao, Qiqing; Kumar, Sudhir (2018) Theoretical Foundation of the RelTime Method for Estimating Divergence Times from Variable Evolutionary Rates. Mol Biol Evol 35:1770-1782
Hedges, S Blair; Tao, Qiqing; Walker, Mark et al. (2018) Accurate timetrees require accurate calibrations. Proc Natl Acad Sci U S A 115:E9510-E9511
Kumar, Sudhir; Stecher, Glen; Li, Michael et al. (2018) MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms. Mol Biol Evol 35:1547-1549