The long-term objective of this project is to develop statistical and computational methods for the analysis of haplotypes in population genetics. With the availability of large numbers of genetic markers in the human genome and advances in genotyping technology, it is becoming feasible in population genetic studies to genotype thousands of markers in a large number of individuals from multiple populations. The analysis of such data poses challenging statistical and computational issues, and both theoretical and empirical studies are needed to develop and evaluate statistical methods that extract the most relevant information for inference on the parameters of interest.
The specific aims of this project are: (1) Develop statistical and computational methods to infer haplotype frequencies from the observed unphased marker data in multiple populations; (2) Develop general guidelines for marker selection to identify disease susceptibility variants through haplotypes; (3) Use haplotypes consisting of single nucleotide polymorphisms as well as microsatellites from multiple populations for inference on population parameters as well as local recombination rates; (4) Investigate the power of statistical methods to identify chromosomal regions that have been subject to natural selection; (5) Implement and validate the developed methodologies in computer programs that will be distributed to the scientific community; and (6) Collaborate with other investigators to apply the methods and knowledge gained from this project to analyze data from other projects. Our methods will exploit two unique features of the data to be collected: the large number of populations around the world and the exhaustive cataloguing of haplotypes in extended chromosomal regions. The development of these novel statistical methods and user-friendly computer programs will provide useful tools for population genetic studies, and the analysis of data collected from other projects will lead to a better understanding of the relationships among various populations and of the different forces that shape linkage disequilibrium patterns in the human genome.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Program Projects (P01)
Study Section
Special Emphasis Panel (ZRG1)
Yale University
New Haven
United States
Heffelfinger, Christopher; Pakstis, Andrew J; Speed, William C et al. (2014) Haplotype structure and positive selection at TLR1. Eur J Hum Genet 22:551-7
Murdoch, John D; Speed, William C; Pakstis, Andrew J et al. (2013) Worldwide population variation and haplotype analysis at the serotonin transporter gene SLC6A4 and implications for association studies. Biol Psychiatry 74:879-89
Donnelly, Michael P; Paschou, Peristera; Grigorenko, Elena et al. (2012) A global view of the OCA2-HERC2 region and pigmentation. Hum Genet 131:683-96
Reich, David; Patterson, Nick; Campbell, Desmond et al. (2012) Reconstructing Native American population history. Nature 488:370-4
Pakstis, Andrew J; Fang, Rixun; Furtado, Manohar R et al. (2012) Mini-haplotypes as lineage informative SNPs and ancestry inference SNPs. Eur J Hum Genet 20:1148-54
Nakagome, Shigeki; Mano, Shuhei; Kozlowski, Lukasz et al. (2012) Crohn's disease risk alleles on the NOD2 locus have been maintained by natural selection on standing variation. Mol Biol Evol 29:1569-85
Kidd, Judith R; Friedlaender, Françoise; Pakstis, Andrew J et al. (2011) Single nucleotide polymorphisms and haplotypes in Native American populations. Am J Phys Anthropol 146:495-502
Li, Hui; Gu, Sheng; Han, Yi et al. (2011) Diversification of the ADH1B gene during expansion of modern humans. Ann Hum Genet 75:497-507
Palejev, Dean; Hwang, Wookyeon; Landi, Nicole et al. (2011) An application of the elastic net for an endophenotype analysis. Behav Genet 41:120-4
Godshalk, S E; Paranjape, T; Nallur, S et al. (2011) A Variant in a MicroRNA complementary site in the 3' UTR of the KIT oncogene increases risk of acral melanoma. Oncogene 30:1542-50

Showing the most recent 10 out of 73 publications