In this proposal, incongruence between gene trees and species trees is examined using data from multiple genes in the context of the coalescent process. First, the investigator will show that one currently advocated approach for the analysis of data from multiple genes, the concatenation approach, can be statistically inconsistent, even when a consistent method of phylogenetic tree estimation is used. Second, an algorithm for maximum likelihood (ML) estimation of species trees from data on multiple genes under the coalescent model will be developed, implemented, and made freely available via the internet. The availability of a method for ML species tree estimation will allow for likelihood-based testing of phylogeographic and population genetic hypotheses. Further, methods for assessing uncertainty in the species tree estimates will be developed by extending traditional bootstrapping methods in phylogenetics to the case in which data have been collected for multiple genes sampled randomly throughout the genome. Finally, tests for the adequacy of the coalescent model will be developed by examining whether the observed gene trees are consistent with a given species tree, using several metrics to measure levels of incongruence.

The inference of the evolutionary history of a collection of organisms based on the information contained in their DNA sequences is a problem of fundamental importance in evolutionary biology. The abundance of DNA sequence data arising from genome sequencing projects has led to significant challenges in the inference of these phylogenetic relationships. Among these challenges is the inference of the evolutionary history of a collection of species based on DNA sequence information from several distinct genes sampled throughout the genome. This project studies the effect of the coalescent process on the inference of species phylogenies using data from multiple genes. This work will first demonstrate that failure to model the coalescent process can lead to incorrect inferences of species relationships. The investigator will then develop methods that can accurately estimate species phylogenies by explicitly modeling the coalescent process, and will apply these estimation procedures to construct techniques for hypothesis testing and for measuring uncertainty in estimated species phylogenies.
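
As a small illustration of how the coalescent process produces gene trees that disagree with the species tree, the sketch below uses the standard three-species result: for a species tree ((A,B),C) whose internal branch has length t in coalescent units, the gene tree matches the species tree with probability 1 - (2/3)e^(-t), and each of the two discordant topologies has probability (1/3)e^(-t). This is not the estimation method proposed here; the function names and the one-lineage-per-species setup are assumptions made for the example.

```python
import math
import random

TOPOLOGIES = ["((A,B),C)", "((A,C),B)", "((B,C),A)"]


def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities for the species tree ((A,B),C).

    t is the internal branch length in coalescent units (assumed >= 0),
    with one sampled lineage per species (an assumption of this sketch).
    """
    # Probability that the A and B lineages fail to coalesce
    # within the internal branch of the species tree.
    p_no_coal = math.exp(-t)
    return {
        "((A,B),C)": 1.0 - (2.0 / 3.0) * p_no_coal,
        "((A,C),B)": p_no_coal / 3.0,
        "((B,C),A)": p_no_coal / 3.0,
    }


def simulate_gene_tree(t, rng):
    """Draw one gene tree topology under the same three-species model."""
    # The waiting time for two lineages to coalesce is Exponential(1)
    # in coalescent units.
    if rng.expovariate(1.0) < t:
        return "((A,B),C)"
    # Otherwise all three lineages reach the ancestral population,
    # where each pair is equally likely to coalesce first.
    return rng.choice(TOPOLOGIES)


if __name__ == "__main__":
    rng = random.Random(0)
    t = 0.5
    draws = [simulate_gene_tree(t, rng) for _ in range(10000)]
    for topology, prob in three_taxon_gene_tree_probs(t).items():
        freq = draws.count(topology) / len(draws)
        print(f"{topology}: analytic {prob:.3f}, simulated {freq:.3f}")
```

When t is small, the three topologies are nearly equally likely, so a substantial fraction of genes is expected to disagree with the species tree even when every gene tree is estimated without error.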

--

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 0505265
Program Officer: Grace Yang
Project Start:
Project End:
Budget Start: 2005-08-01
Budget End: 2007-02-28
Support Year:
Fiscal Year: 2005
Total Cost: $74,999
Indirect Cost:
Name: University of New Mexico
Department:
Type:
DUNS #:
City: Albuquerque
State: NM
Country: United States
Zip Code: 87131