Genome-wide association studies (GWAS) have been successful in identifying common genetic variants contributing to disease risk. However, nearly all of these studies have been conducted in populations of European ancestry. It is important to include other populations, because GWAS in Europeans are unlikely to detect risk variants that are common only in non-European populations. In the United States, the majority of individuals of non-European ancestry belong to admixed populations (e.g. African Americans or Latinos) that inherit ancestry from more than one continental population. Existing GWAS methods for admixed populations are inadequate, because they do not incorporate both SNP association and admixture association signals. Analyses that rely on existing methods will therefore be underpowered, and important variants will be missed. For diseases with known population differences, such as cardiovascular disease in African Americans and asthma in Latinos, the need for methods that combine these signals is particularly pressing, because admixture association signals are likely to be especially important. Here, we propose to develop a complete set of methods and software to combine SNP and admixture association signals in GWAS in admixed populations, while addressing questions such as imputation and choosing SNPs for replication. Our goal is to make fully powered association studies in populations of mixed ancestry as practical as studies in populations of homogeneous ancestry. Beyond African Americans, we will develop methods for complex admixed populations (e.g. Latinos) that inherit ancestry from three or more continental populations, and for related individuals from admixed populations. Our methods research will be driven by empirical data, including over 10,000 African American samples and 2,400 Latino samples that will be genome-scanned by the CARe consortium, the Jackson Heart Study, and the Multiethnic Cohort Study. Our work will be applicable not only to GWAS in admixed populations, but also to meta-analyses of European and admixed populations, as well as future resequencing-based studies.

Public Health Relevance

Genome-wide association studies (GWAS), an approach in which the genomes of both diseased and healthy individuals are scanned to identify genes affecting disease risk, have thus far been primarily restricted to populations of European ancestry. It is important to extend these studies to other populations, such as admixed populations (African Americans and Latinos) that inherit ancestry from multiple continental groups, but existing statistical methods for conducting GWAS in admixed populations are inadequate due to the complexities posed by chromosomal segments of distinct continental ancestry. In this proposal, we will use empirical genetic data sets to develop statistical methods and software to fill this gap.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Brooks, Lisa
Harvard University
Public Health & Prev Medicine
Schools of Public Health
United States
Mathieson, Iain; Reich, David (2017) Differences in the rare variant spectrum among human populations. PLoS Genet 13:e1006581
Hayeck, Tristan J; Loh, Po-Ru; Pollack, Samuela et al. (2017) Mixed Model Association with Family-Biased Case-Control Ascertainment. Am J Hum Genet 100:31-39
Pasaniuc, Bogdan; Price, Alkes L (2017) Dissecting the genetics of complex traits using summary association statistics. Nat Rev Genet 18:117-127
Gymrek, Melissa; Willems, Thomas; Reich, David et al. (2017) Interpreting short tandem repeat variations in humans using mutational constraint. Nat Genet 49:1495-1501
Nakatsuka, Nathan; Moorjani, Priya; Rai, Niraj et al. (2017) The promise of discovering population-specific disease-associated genes in South Asia. Nat Genet 49:1403-1407
Loh, Po-Ru; Palamara, Pier Francesco; Price, Alkes L (2016) Fast and accurate long-range phasing in a UK Biobank cohort. Nat Genet 48:811-6
Galinsky, Kevin J; Loh, Po-Ru; Mallick, Swapan et al. (2016) Population Structure of UK Biobank and Ancient Eurasians Reveals Adaptation at Genes Influencing Blood Pressure. Am J Hum Genet 99:1130-1139
Mallick, Swapan; Li, Heng; Lipson, Mark et al. (2016) The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538:201-206
Gymrek, Melissa; Willems, Thomas; Guilmatre, Audrey et al. (2016) Abundant contribution of short tandem repeats to gene expression variation in humans. Nat Genet 48:22-9
Brown, Brielin C; Asian Genetic Epidemiology Network Type 2 Diabetes Consortium; Ye, Chun Jimmie et al. (2016) Transethnic Genetic-Correlation Estimates from Summary Statistics. Am J Hum Genet 99:76-88
