Understanding the evolutionary relationships among a collection of species is one of the most fundamental objectives in evolutionary biology. Advances in DNA sequencing technology have resulted in a shift toward the use of multiple genes to infer such phylogenetic histories. With this shift has come the realization that each gene in the genome has its own history, which may differ from the overall evolutionary history of the species. In this project, two processes known to cause discord between gene and species histories will be modeled statistically. The first, termed "incomplete lineage sorting," leads to substantial differences in the evolution of individual genes and occurs in virtually all organisms. The second, hybridization (the mating of individuals from two distinct species with the production of viable offspring), is believed to occur in as many as 25% of plant species and 10% of animal species. This project will develop statistical methods that model both processes simultaneously to reconstruct the history of evolutionary relationships among organisms.
Development of statistically sound phylogenetic models incorporating both processes will allow two fundamental biological issues to be addressed: (1) estimation of the extent of historical hybridization between pairs of species; and (2) inference of the true phylogenetic history when gene-by-gene variation due to both processes is taken into account. The result will be a more accurate understanding of the evolutionary relationships among present-day species. In addition, the investigator is involved in an NSF-funded program in mathematical biology for undergraduates, and thus undergraduate researchers will participate in some aspects of this research.