Reconciliation analysis is a fundamental method in the study of species and genes, hosts and parasites, and geographical areas and species. While these applications are different, the underlying mathematical and computational problems are analogous. This research will develop new algorithms, visualization methods, and software tools for studying the evolutionary histories of such pairs of entities, and will also help to train the next generation of researchers.

Genes and species interact through complex evolutionary processes such as gene duplication, horizontal gene transfer, and gene loss. Parasites and their hosts coevolve through processes including both contemporaneous and independent speciation, host switching, and extinction. Species and their geographical habitats interact over geological time scales through vicariance, sympatric speciation, dispersal, and extinction. Consequently, the phylogenetic trees for genes and species, parasites and hosts, and species and their geographic regions are inherently incongruent. Phylogenetic tree reconciliation seeks to reconstruct the evolutionary histories of pairs of related entities by positing the evolutionary events that explain their incongruence. Traditional maximum parsimony methods for tree reconciliation require the user to select a cost for each type of evolutionary event. These cost parameters are notoriously difficult to estimate, and their values can substantially affect the results and conclusions. This project uses a Pareto-optimal methodology that does not require the user to select event costs a priori, thereby providing a systematic view of the set of all possible optimal reconciliations. This work will ultimately result in software tools with applications across the life sciences, including genomics, parasitology, virology, and biogeography. In addition, this project will involve a substantial number of undergraduates, thereby helping prepare the next generation of researchers in computational biology.
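
The contrast between fixed event costs and a Pareto-optimal view can be illustrated with a small sketch. It is not the project's algorithm, which operates on pairs of trees; it assumes a hypothetical list of candidate reconciliations that have already been summarized as event counts (duplications, transfers, losses), and every name and number below is made up for illustration. A candidate is Pareto-optimal if no other candidate uses no more of every event type and fewer of at least one.

```python
from typing import List, Tuple

# Hypothetical summary of one reconciliation: (duplications, transfers, losses).
Counts = Tuple[int, int, int]


def dominates(a: Counts, b: Counts) -> bool:
    """True if a is no worse than b in every event count and differs from b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def pareto_front(candidates: List[Counts]) -> List[Counts]:
    """Return the candidates that no other candidate dominates, sorted."""
    unique = sorted(set(candidates))
    return [c for c in unique if not any(dominates(o, c) for o in unique)]


def cheapest(candidates: List[Counts], costs: Tuple[int, int, int]) -> Counts:
    """Maximum-parsimony choice for one fixed set of per-event costs."""
    return min(candidates, key=lambda c: sum(w * n for w, n in zip(costs, c)))


# Illustrative event counts only; not derived from any real data set.
candidates = [(3, 0, 5), (1, 2, 1), (0, 4, 0), (2, 2, 2), (1, 1, 3)]

front = pareto_front(candidates)

# Different cost choices select different candidates, all on the front.
pick_a = cheapest(candidates, (2, 3, 1))
pick_b = cheapest(candidates, (2, 1, 2))
pick_c = cheapest(candidates, (1, 5, 1))
```

In this toy input, each of the three cost settings selects a different candidate, which is the sensitivity to cost parameters described above, while the Pareto front lists every candidate that some choice of positive costs could favor without committing to one setting in advance.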

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1419739
Program Officer: Sylvia Spengler
Budget Start: 2014-08-01
Budget End: 2018-07-31
Fiscal Year: 2014
Total Cost: $331,274
Name: Harvey Mudd College
City: Claremont
State: CA
Country: United States
Zip Code: 91711