Species phylogenies model how species split and diverge and provide important insight into fundamental biological phenomena and processes, including biodiversity and trait evolution, while gene trees provide insight into protein structure and function as well as systems biology. Advances in sequencing technologies and assembly methods, together with the availability of whole-genome datasets, have opened up the possibility of transformative improvements in the accuracy of estimated species phylogenies and gene trees. Phylogenetic networks extend phylogenetic trees to provide an appropriate model of reticulate evolutionary histories. Reticulate evolution describes the origination of a lineage through the partial merging of two ancestor lineages. Recently developed methods allow for statistical inference of phylogenetic networks while accounting for other processes that could be at play during the evolution of the genomes. However, these methods can handle fewer than a handful of genomes. This award will develop methods for estimating large-scale phylogenetic networks from sequence data as well as from gene tree estimates. The award will stimulate research in computer science and statistics and will have a major impact on evolutionary biology. The award will contribute open-source code to the PhyloNet software package. Lectures and tutorials will be given to the community on the developments made in the award and on the use of PhyloNet. The award will provide ample opportunities for training students and post-doctoral fellows in cutting-edge, interdisciplinary algorithmic research.

The project will be carried out through five activities that are intertwined throughout the lifetime of the award. (1) Development of novel algorithmic techniques for scalable inference of phylogenetic networks that allow for analyzing data sets with tens and even hundreds of genomes. (2) Implementation of all methods in the PhyloNet software package. (3) Thorough evaluation of the methods in terms of accuracy and computational requirements. (4) Mentoring and training of students and post-doctoral fellows. (5) Dissemination of the results through an open-source software package, publications in peer-reviewed journals and conference proceedings, lectures and tutorials, and course materials.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Computing and Communication Foundations (CCF)
Application #: 1800723
Program Officer: Mitra Basu
Project Start:
Project End:
Budget Start: 2018-06-15
Budget End: 2022-05-31
Support Year:
Fiscal Year: 2018
Total Cost: $750,004
Indirect Cost:
Name: Rice University
Department:
Type:
DUNS #:
City: Houston
State: TX
Country: United States
Zip Code: 77005