The PI proposes to investigate the phylogenetic reconstruction problem from a probabilistic perspective. Inferring the speciation history of a group of organisms is a fundamental problem in evolutionary biology. This history is represented by a phylogeny, i.e., a rooted tree whose leaves correspond to current species and whose branchings indicate speciation events. The stochastic evolution of molecular sequences on such a phylogeny is an instance of a Markov model on a tree. In this project, the PI will further develop connections between the theory of Gibbs measures on trees and the phylogenetic reconstruction problem. Particular emphasis will be placed on models of insertions and deletions, with the objective of providing a probabilistic analysis of the multiple sequence alignment problem. Connections to problems in information theory will also be considered.
Assembling the Tree of Life is a fundamental problem in biology that provides insights into the study of evolution, adaptation, and speciation. Much information about past evolutionary events can be inferred from the analysis of DNA sequence data collected from existing species. A notable feature of the evolution of molecular sequences is the significant role played by randomness. In recent years, probability theory, the mathematical study of randomness, has provided key new insights into the power of statistical methods to reconstruct evolutionary processes in large-scale phylogenetics. The main theme of this project is to further investigate these connections. In particular, more realistic models of evolution will be considered, and new phylogenetic analysis techniques will be developed and implemented.