This proposal will continue the development of methods for making likelihood inferences from population samples of molecular sequences, where there may be recombination within the sequences and natural selection affecting some sites. Molecular and population biologists continue to collect increasing amounts of such data; as in other areas, likelihood methods will become important in their analysis. Population biologists want to use the sequences to understand population history and evolutionary forces, while molecular biologists want to use within-species variation to understand which regions of a molecule are constrained from varying. It is proposed to develop statistical methods for approximately computing the likelihoods of population parameters, such as effective population sizes and population growth rates; genetic parameters, such as the rate of recombination; and patterns of natural selection, such as balancing selection or directional selection acting at particular sites. The likelihoods can be computed by summing over all the possible recombining genealogies connecting the members of an observed sample. While there are far too many genealogies to do the sum exactly, Markov chain Monte Carlo methods such as the Metropolis-Hastings method can be used to draw a large enough random sample of genealogies, which are then used to estimate the likelihood curves. Previous work has developed algorithms for recombining sequences; the present proposal will develop new methods for calculating the likelihoods in the presence of natural selection at specific sites. It will also integrate the existing methods into a usable whole, which will allow biologists to construct an analysis of the particular combination of evolutionary forces that they want to consider. The methods are computer intensive; they will be made available, free, over the Internet, as the LAMARC package of programs, distributed in C source code and as executables.
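The sampling idea in the abstract can be illustrated with a deliberately tiny sketch. It is not the LAMARC algorithm: the "genealogy" is reduced to a single coalescence time `t` for two sequences, the prior on `t` is assumed to be exponential with mean `theta`, and the number of observed differences `k` is assumed to be Poisson with mean `2t`. All function names and parameter values are hypothetical. A Metropolis-Hastings chain draws times at a driving value `theta0`, and the sampled times are reweighted to estimate the relative likelihood L(theta)/L(theta0). In this toy model the sum over genealogies has a closed form, so the estimate can be checked.

```python
import math
import random


def log_prior(t, theta):
    # Assumed toy prior: coalescence time is exponential with mean theta.
    return -math.log(theta) - t / theta


def log_data_lik(k, t):
    # Assumed toy mutation model: k differences ~ Poisson(2 * t).
    return -2.0 * t + k * math.log(2.0 * t) - math.lgamma(k + 1)


def sample_times(k, theta0, n_samples, burn_in=1000, step=0.5, seed=1):
    """Metropolis-Hastings draws of t from P(t | k, theta0)."""
    rng = random.Random(seed)
    t = 1.0
    log_post = log_data_lik(k, t) + log_prior(t, theta0)
    samples = []
    for i in range(burn_in + n_samples):
        # Multiplicative random-walk proposal keeps t positive.
        proposal = t * math.exp(rng.uniform(-step, step))
        log_post_new = log_data_lik(k, proposal) + log_prior(proposal, theta0)
        # Hastings correction for this proposal is proposal / t.
        log_accept = log_post_new - log_post + math.log(proposal / t)
        if rng.random() < math.exp(min(0.0, log_accept)):
            t, log_post = proposal, log_post_new
        if i >= burn_in:
            samples.append(t)
    return samples


def relative_likelihood(samples, theta, theta0):
    """Estimate L(theta) / L(theta0) by averaging prior ratios over samples."""
    weights = [math.exp(log_prior(t, theta) - log_prior(t, theta0))
               for t in samples]
    return sum(weights) / len(weights)


def exact_likelihood(k, theta):
    # Closed form of the integral over t, available only in this toy model.
    return (2.0 * theta) ** k / (1.0 + 2.0 * theta) ** (k + 1)


if __name__ == "__main__":
    k, theta0 = 3, 1.0
    draws = sample_times(k, theta0, n_samples=20000)
    for theta in (0.5, 1.0, 2.0, 4.0):
        est = relative_likelihood(draws, theta, theta0)
        exact = exact_likelihood(k, theta) / exact_likelihood(k, theta0)
        print(f"theta={theta}: estimated {est:.3f}, exact {exact:.3f}")
```

Evaluating `relative_likelihood` over a grid of `theta` values traces out an estimated likelihood curve from a single set of sampled times. The estimate degrades as `theta` moves far from `theta0`, which is one reason a real analysis over genealogies is computer intensive.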

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM051929-06
Application #
6138497
Study Section
Genetics Study Section (GEN)
Program Officer
Eckstrand, Irene A
Project Start
1995-01-01
Project End
2002-12-31
Budget Start
2000-01-01
Budget End
2000-12-31
Support Year
6
Fiscal Year
2000
Total Cost
$262,690
Indirect Cost
Name
University of Washington
Department
Genetics
Type
Schools of Arts and Sciences
DUNS #
135646524
City
Seattle
State
WA
Country
United States
Zip Code
98195
McGill, James R; Walkup, Elizabeth A; Kuhner, Mary K (2013) Correcting coalescent analyses for panel-based SNP ascertainment. Genetics 193:1185-96
Kuhner, Mary K (2009) Coalescent genealogy samplers: windows into population history. Trends Ecol Evol 24:86-93
Smith, Lucian P; Kuhner, Mary K (2009) The limits of fine-scale mapping. Genet Epidemiol 33:344-56
Kuhner, Mary K; Smith, Lucian P (2007) Comparing likelihood and Bayesian coalescent estimation of population parameters. Genetics 175:155-65
Kuhner, Mary K (2006) Robustness of coalescent estimators to between-lineage mutation rate variation. Mol Biol Evol 23:2355-60
Kuhner, Mary K (2006) LAMARC 2.0: maximum likelihood and Bayesian estimation of population parameters. Bioinformatics 22:768-70
Felsenstein, Joseph (2006) Accuracy of coalescent likelihood estimates: do we need more sites, more sequences, or more loci? Mol Biol Evol 23:691-700
Felsenstein, Joseph (2005) Using the quantitative genetic threshold model for inferences between and within species. Philos Trans R Soc Lond B Biol Sci 360:1427-34
Beerli, Peter (2004) Effect of unsampled populations on the estimation of population sizes and migration rates between sampled populations. Mol Ecol 13:827-36
Felsenstein, J (2001) Taking variation of evolutionary rates between sites into account in inferring phylogenies. J Mol Evol 53:447-55

Showing the most recent 10 out of 16 publications