Single nucleotide polymorphism (SNP, pronounced 'snip') data are quickly becoming popular for addressing an array of problems in human genetics and in population and evolutionary biology in general. Modern SNP discovery techniques make it feasible to survey multiple populations and numerous individuals using thousands of SNPs distributed over the entire genome. These large SNP datasets are important not only in human genetics; they will also increase the precision of many population genetic and evolutionary studies in other species. However, the analysis of SNP data is not straightforward. For instance, it is necessary to take into account the condition that every SNP is variable; the appropriate ascertainment correction depends on the nature of the SNP discovery process. Another difficulty is that each SNP contributes only a minute amount of population genetic or phylogenetic information. Only when information can be extracted efficiently from large SNP datasets do SNPs become more informative about population or species history than traditional sequence data. Coalescent-based approaches, which take into account the genealogical relationships among sampled individuals, suffer from these restrictions, and none of the current implementations works well with thousands of independent SNP loci. Our proposed research will focus on three areas of analytical improvement that will facilitate the analysis of SNP data: (1) allowing for biased genealogy ascertainment by finding mathematical formulae that enable us to collapse large genealogies into simpler ones in which all subtrees whose tips have the same SNP allele, occur in the same population, or belong to the same selection class are combined; (2) improving the precision of the correction for ascertainment bias by exploring different correction schemes; and (3) increasing model accuracy by accommodating different substitution patterns in exons and introns.
Collapsing trees into smaller constructs will be particularly important in reducing the computational burden associated with the analysis of large SNP datasets. These analytical improvements will be implemented in a computer program, which will allow researchers who work on difficult problems in human genetics, human ancestry, and other fields to analyze large numbers of SNP loci in a reasonable amount of time.

Single nucleotide polymorphism (SNP) data are already abundant, but current programs are not able to handle this flood of data. We will develop algorithms that allow the large genealogies of sampled individuals to be collapsed into smaller constructs and that account for biased sampling of SNPs. These methods will reduce the computational burden and will thus enable researchers to work on complicated population models using large numbers of SNP loci in a reasonable amount of time.
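
The condition that every SNP is variable can be illustrated with a textbook case. The sketch below is not the project's correction scheme; it assumes the standard neutral, constant-size coalescent with infinite-sites mutation, under which the expected number of sites carrying i derived copies in a sample of n chromosomes is proportional to 1/i. Conditioning on variability then amounts to dividing each weight by the sum over all variable classes. The function name `conditional_sfs` is hypothetical.

```python
from fractions import Fraction


def conditional_sfs(n):
    """Probability that a SNP has i derived copies (i = 1..n-1) in a sample
    of n chromosomes, conditional on the site being variable.

    Assumption (not from the source): standard neutral coalescent with
    infinite-sites mutation, so the unconditional weight of class i is
    proportional to 1/i.
    """
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    # Unnormalized weights for each derived-allele count i = 1..n-1.
    weights = [Fraction(1, i) for i in range(1, n)]
    # The normalizing constant plays the role of P(site is variable).
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    for i, p in enumerate(conditional_sfs(4), start=1):
        print(f"derived count {i}: {p}")
```

For a sample of four chromosomes the conditional probabilities are 6/11, 3/11, and 2/11. A real SNP discovery process (for example, ascertainment in a small panel) would change the normalizing term, which is why the appropriate correction depends on how the SNPs were found.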

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM078985-03
Application #: 7414574
Study Section: Special Emphasis Panel (ZGM1-CBCB-5 (BM))
Program Officer: Eckstrand, Irene A
Project Start: 2006-05-01
Project End: 2011-04-30
Budget Start: 2008-05-01
Budget End: 2011-04-30
Support Year: 3
Fiscal Year: 2008
Total Cost: $138,504
Indirect Cost:

Name: Florida State University
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 790877419
City: Tallahassee
State: FL
Country: United States
Zip Code: 32306

Hotz, Hansjürg; Beerli, Peter; Uzzell, Thomas et al. (2013) Balancing a cline by influx of migrants: a genetic transition in water frogs of eastern Greece. J Hered 104:57-71
Bradic, Martina; Beerli, Peter; García-de León, Francisco J et al. (2012) Gene flow and population structure in the Mexican blind cavefish complex (Astyanax mexicanus). BMC Evol Biol 12:9
Ayres, Daniel L; Darling, Aaron; Zwickl, Derrick J et al. (2012) BEAGLE: an application programming interface and high-performance computing library for statistical phylogenetics. Syst Biol 61:170-3
Bedford, Trevor; Cobey, Sarah; Beerli, Peter et al. (2010) Global migration dynamics underlie evolution and persistence of human influenza A (H3N2). PLoS Pathog 6:e1000918
Beerli, Peter; Palczewski, Michal (2010) Unified framework to evaluate panmixia and migration direction among multiple sampling locations. Genetics 185:313-26
Akın, Çiğdem; Bilgin, C Can; Beerli, Peter et al. (2010) Phylogeographic patterns of genetic diversity in eastern Mediterranean water frogs have been determined by geological processes and climate change in the Late Cenozoic. J Biogeogr 37:2111-2124
Plötner, Jörg; Köhler, Frank; Uzzell, Thomas et al. (2009) Evolution of serum albumin intron-1 is shaped by a 5' truncated non-long terminal repeat retrotransposon in western Palearctic water frogs (Neobatrachia). Mol Phylogenet Evol 53:784-91
Gonzalez, Elena G; Beerli, Peter; Zardoya, Rafael (2008) Genetic structuring and migration patterns of Atlantic bigeye tuna, Thunnus obesus (Lowe, 1839). BMC Evol Biol 8:252
Douhan, G W; Smith, M E; Huyrn, K L et al. (2008) Multigene analysis suggests ecological speciation in the fungal pathogen Claviceps purpurea. Mol Ecol 17:2276-86
Plotner, J; Uzzell, T; Beerli, P et al. (2008) Widespread unidirectional transfer of mitochondrial DNA: a case in western Palaearctic water frogs. J Evol Biol 21:668-81

Showing the most recent 10 out of 11 publications