Genome-wide association studies (GWAS) have been successful in identifying common genetic variants contributing to disease risk. However, nearly all of these studies have been conducted in populations of European ancestry. It is important to include other populations, because GWAS in Europeans are unlikely to detect risk variants that are common only in non-European populations. In the United States, the majority of individuals of non-European ancestry belong to admixed populations (e.g. African Americans or Latinos) that inherit ancestry from more than one continental population. Existing GWAS methods for admixed populations are inadequate, because they do not incorporate both SNP association and admixture association signals. Thus, if existing methods are applied, analyses will not be fully powered and important variants will be missed. For diseases with known population differences, such as cardiovascular disease in African Americans and asthma in Latinos, the need to develop methods that combine these signals is especially pressing, because admixture association signals are likely to be particularly important. Here, we propose to develop a complete set of methods and software to combine SNP and admixture association signals in GWAS in admixed populations, while addressing questions such as imputation and choosing SNPs for replication. Our goal is to make fully powered association studies in populations of mixed ancestry as practical as studies in populations of homogeneous ancestry. In addition to African Americans, we will also develop methods for complex admixed populations (e.g. Latinos) that inherit ancestry from three or more continental populations, and for related individuals from admixed populations. Our methods research will be driven by empirical data, including over 10,000 African American samples and 2,400 Latino samples that will be genome-scanned by the CARe consortium, the Jackson Heart Study, and the Multiethnic Cohort Study. Our work will be applicable not only to GWAS in admixed populations, but also to meta-analyses of European and admixed populations, as well as future resequencing-based studies.

Public Health Relevance

Genome-wide association studies (GWAS), an approach in which the genomes of both diseased and healthy individuals are scanned to identify genes affecting disease risk, have thus far been primarily restricted to populations of European ancestry. It is important to extend these studies to other populations, such as admixed populations (African Americans and Latinos) that inherit ancestry from multiple continental groups, but existing statistical methods for conducting GWAS in admixed populations are inadequate due to the complexities posed by chromosomal segments of distinct continental ancestry. In this proposal, we will use empirical genetic data sets to develop statistical methods and software to fill this gap.

National Institutes of Health (NIH)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Brooks, Lisa
Harvard University
Public Health & Prev Medicine
Schools of Public Health
United States
SIGMA Type 2 Diabetes Consortium; Williams, Amy L; Jacobs, Suzanne B R et al. (2014) Sequence variants in SLC16A11 are a common risk factor for type 2 diabetes in Mexico. Nature 506:97-101
Bhatia, Gaurav; Tandon, Arti; Patterson, Nick et al. (2014) Genome-wide scan of 29,141 African Americans finds no evidence of directional selection since admixture. Am J Hum Genet 95:437-44
Pasaniuc, Bogdan; Zaitlen, Noah; Shi, Huwenbo et al. (2014) Fast and accurate imputation of summary statistics enhances evidence of functional enrichment. Bioinformatics 30:2906-14
Yang, Jian; Zaitlen, Noah A; Goddard, Michael E et al. (2014) Advantages and pitfalls in the application of mixed-model association methods. Nat Genet 46:100-6
Zaitlen, Noah; Pasaniuc, Bogdan; Sankararaman, Sriram et al. (2014) Leveraging population admixture to characterize the heritability of complex traits. Nat Genet 46:1356-62
Chimusa, Emile R; Zaitlen, Noah; Daya, Michelle et al. (2014) Genome-wide association study of ancestry-specific TB risk in the South African Coloured population. Hum Mol Genet 23:796-809
Tucker, George; Price, Alkes L; Berger, Bonnie (2014) Improving the power of GWAS and avoiding confounding from population stratification with PC-Select. Genetics 197:1045-9
Chen, Chia-Yen; Pollack, Samuela; Hunter, David J et al. (2013) Improved ancestry inference using weights from external reference panels. Bioinformatics 29:1399-406
Bhatia, Gaurav; Patterson, Nick; Sankararaman, Sriram et al. (2013) Estimating and interpreting FST: the impact of rare variants. Genome Res 23:1514-21
Baran, Yael; Pasaniuc, Bogdan; Sankararaman, Sriram et al. (2012) Fast and accurate inference of local ancestry in Latino populations. Bioinformatics 28:1359-67