All species are related to one another through an unknown genealogical tree: the 'Tree of Life.' Biologists call the genealogical trees for small parts of the Tree of Life 'phylogenies,' and interest in phylogenies is greater today than at any time in the past. Phylogenies are crucial in fields such as epidemiology, where they are used to track the spread of infectious disease, and molecular biology, where they are used to understand how molecular pathways developed. The Tree of Life can be discovered by comparing the characteristics of different organisms. Traditionally, phylogenies were reconstructed by comparing anatomical traits across species. More recently, DNA sequence information from individual genes, or even full genomes, is compared to estimate phylogenies. This project will continue the development of RevBayes, a computer program that biologists use to reconstruct phylogenies. The program takes as input the comparative information from the organisms of interest and outputs the probabilities of the phylogenies that best explain the comparative data. Specifically, the project will improve the speed, usability, and reliability of the program.
RevBayes is the successor to MrBayes, a program widely used by biologists to estimate phylogeny. However, RevBayes represents a significant departure from MrBayes. RevBayes implements an R-like language for describing statistical models. The model is represented in computer memory as a graph in which the vertices are the parameters and the edges represent the dependencies between parameters. RevBayes uses Markov chain Monte Carlo to approximate the posterior probability distribution of the parameters. This project will improve RevBayes in several significant ways: (1) unit and integration testing will be implemented to improve the reliability of the program; (2) computational resources, such as multiple cores or GPUs, will be used more effectively to improve the speed of the program; (3) a cross-platform graphical user interface will be developed to improve the usability of the program; (4) output will be bundled to improve the reproducibility of analyses; and (5) the program will be made to work with other software programs to improve its interoperability. Finally, the participants will host several workshops in which phylogenetic theory will be described, with emphasis on applying the theory to real-world problems using RevBayes. The source code is available through the project's GitHub site (http://revbayes.github.io/about.html).
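The graph representation and the Markov chain Monte Carlo step described above can be illustrated with a small sketch. It is written in Python rather than in the RevBayes language, and it is not taken from the RevBayes source code: the names Node, log_joint, metropolis, and run_example are hypothetical, and the model (a coin with unknown heads probability p, a uniform prior, and 14 heads observed in 20 tosses) is an assumption chosen only because it is small. Each vertex stores a value and its parent vertices, the joint log-probability is the sum over vertices, and a simple Metropolis sampler approximates the posterior distribution of p.

```python
import math
import random


class Node:
    """One vertex of the model graph (hypothetical class, for illustration only)."""

    def __init__(self, name, value, parents, log_density):
        self.name = name
        self.value = value
        self.parents = parents          # edges: the vertices this one depends on
        self.log_density = log_density  # log p(value | parent values)

    def log_prob(self):
        return self.log_density(self.value, *[p.value for p in self.parents])


def log_uniform01(x):
    """Log-density of a Uniform(0, 1) prior."""
    return 0.0 if 0.0 < x < 1.0 else -math.inf


def make_log_binomial(n):
    """Build the log-probability of k successes in n trials, given probability p."""
    def log_binomial(k, p):
        if not 0.0 < p < 1.0:
            return -math.inf
        return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return log_binomial


def log_joint(nodes):
    """Joint log-probability: the sum of every vertex's conditional log-density."""
    return sum(node.log_prob() for node in nodes)


def metropolis(nodes, target, iterations, window=0.2, seed=1):
    """Sample target.value with a symmetric sliding-window Metropolis proposal."""
    rng = random.Random(seed)
    samples = []
    current = log_joint(nodes)
    for _ in range(iterations):
        old_value = target.value
        target.value = old_value + rng.uniform(-window, window)
        proposed = log_joint(nodes)
        # Accept with probability min(1, posterior ratio); otherwise restore.
        if rng.random() < math.exp(min(0.0, proposed - current)):
            current = proposed
        else:
            target.value = old_value
        samples.append(target.value)
    return samples


def run_example(iterations=20000, burn_in=2000):
    """Build the two-vertex graph p -> heads and return the posterior mean of p."""
    p = Node("p", 0.5, [], log_uniform01)
    heads = Node("heads", 14, [p], make_log_binomial(20))  # observed, never updated
    samples = metropolis([p, heads], p, iterations)
    kept = samples[burn_in:]
    return sum(kept) / len(kept)


if __name__ == "__main__":
    print("Estimated posterior mean of p:", run_example())
```

For this toy model the posterior is known analytically to be a Beta(15, 7) distribution with mean 15/22, so the sampler's estimate can be checked against it. Phylogenetic models have the same shape but far more vertices, including the tree itself, which is why speed and testing matter for a program of this kind.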
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.