It is proposed to carry out a series of computer simulations testing methods for inferring genealogies and phylogenies from molecular sequences, and methods for placing confidence intervals on those genealogies and phylogenies (henceforth called genealogies). In addition to evaluating existing methods and testing their adequacy, some new likelihood-based methods will be developed. These will be made available to researchers by adding them to the existing PHYLIP Phylogeny Inference Package, which will continue to be distributed for free. Genealogies and phylogenies are branching patterns of descent. For samples of molecular sequences from populations, they are central to interpreting the patterns of sequence diversity, but little is known of the statistical properties of methods for inferring them. The proposed study involves simulating genealogies by random branching, recording these true genealogies, and then simulating the evolution of nucleic acid sequences along them. Unequal rates of evolution among lineages, among sites, and between transitions and transversions will be simulated. A variety of methods of estimating genealogies, including parsimony, distance matrix, and likelihood methods, will be compared for accuracy. The problem of inconsistency (convergence to the wrong tree) of parsimony methods will be explored. The accuracy of confidence intervals computed from likelihood ratio methods, bootstrap methods, and a pairwise tree test due to Templeton and to Kishino and Hasegawa will also be examined. Some new computer programs will be produced and distributed for doing maximum likelihood estimation in the presence of a molecular clock, for interactive exploration of likelihood surfaces, and for estimating the position of the root in trees of RNA sequences by a method due to Carl Woese.
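The simulation design described above (random branching, then sequence evolution along the recorded true genealogy) can be illustrated with a minimal Python sketch. It is not the project's code. Everything in it is an assumption made for illustration: the coalescent-style joining of lineages, the Kimura two-parameter substitution model for unequal transition and transversion rates, gamma-distributed rate multipliers for variation among sites, and all function names and default parameter values. Rate variation among lineages is not modeled here; the genealogy is clock-like.

```python
import math
import random

TRANSITION = {"A": "G", "G": "A", "C": "T", "T": "C"}
TRANSVERSIONS = {"A": "CT", "G": "CT", "C": "AG", "T": "AG"}


def simulate_genealogy(n_tips, rng):
    """Join randomly chosen pairs of lineages backward in time (coalescent-style)."""
    lineages = [{"name": f"s{i}", "children": [], "height": 0.0}
                for i in range(n_tips)]
    time = 0.0
    while len(lineages) > 1:
        k = len(lineages)
        time += rng.expovariate(k * (k - 1) / 2)  # wait until the next join
        a, b = rng.sample(range(k), 2)            # the two lineages that join
        node = {"name": None, "children": [lineages[a], lineages[b]],
                "height": time}
        lineages = [x for i, x in enumerate(lineages) if i not in (a, b)]
        lineages.append(node)
    return lineages[0]  # the root; this is the recorded "true" genealogy


def k2p_probs(alpha, beta, t):
    """Kimura two-parameter probabilities after time t:
    (no change, transition, each of the two transversions)."""
    e1 = math.exp(-4 * beta * t)
    e2 = math.exp(-2 * (alpha + beta) * t)
    return (0.25 + 0.25 * e1 + 0.5 * e2,
            0.25 + 0.25 * e1 - 0.5 * e2,
            0.25 - 0.25 * e1)


def evolve_site(base, alpha, beta, t, rng):
    """Draw the base at the end of a branch of length t, given its start."""
    same, ts, _tv = k2p_probs(alpha, beta, t)
    u = rng.random()
    if u < same:
        return base
    if u < same + ts:
        return TRANSITION[base]
    return rng.choice(TRANSVERSIONS[base])


def evolve_down(node, seq, site_rates, alpha, beta, rng, out):
    """Evolve a sequence from this node to every tip below it."""
    if not node["children"]:
        out[node["name"]] = "".join(seq)
        return
    for child in node["children"]:
        t = node["height"] - child["height"]  # branch length
        child_seq = [evolve_site(b, alpha, beta, t * r, rng)
                     for b, r in zip(seq, site_rates)]
        evolve_down(child, child_seq, site_rates, alpha, beta, rng, out)


def simulate_dataset(n_tips=5, n_sites=50, alpha=0.2, beta=0.05,
                     gamma_shape=0.5, seed=1):
    """Return (true genealogy, {tip name: simulated sequence})."""
    rng = random.Random(seed)
    tree = simulate_genealogy(n_tips, rng)
    # Per-site rate multipliers with mean 1 (shape * scale = 1).
    site_rates = [rng.gammavariate(gamma_shape, 1 / gamma_shape)
                  for _ in range(n_sites)]
    root_seq = [rng.choice("ACGT") for _ in range(n_sites)]
    tips = {}
    evolve_down(tree, root_seq, site_rates, alpha, beta, rng, tips)
    return tree, tips
```

Calling `simulate_dataset()` yields the true genealogy together with the tip sequences, so a tree estimated from the sequences by any inference method can be compared against the known truth.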
Felsenstein, J; Churchill, G A (1996) A Hidden Markov Model approach to variation among sites in rate of evolution. Mol Biol Evol 13:93-104
Thorne, J L; Kishino, H (1992) Freeing phylogenies from artifacts of alignment. Mol Biol Evol 9:1148-62
Felsenstein, J (1992) Estimating effective population size from samples of sequences: inefficiency of pairwise and segregating sites as compared to phylogenetic estimates. Genet Res 59:139-47
Thorne, J L; Kishino, H; Felsenstein, J (1992) Inching toward reality: an improved likelihood model of sequence evolution. J Mol Evol 34:3-16
Felsenstein, J (1992) Estimating effective population size from samples of sequences: a bootstrap Monte Carlo integration method. Genet Res 60:209-20
Felsenstein, J (1991) Counting phylogenetic invariants in some simple cases. J Theor Biol 152:357-76
Thorne, J L; Kishino, H; Felsenstein, J (1991) An evolutionary model for maximum likelihood alignment of DNA sequences. J Mol Evol 33:114-24