Genome-wide association studies (GWAS) have been successful in identifying genetic variants affecting the risk of common diseases. GWAS have identified thousands of associated variants, and in some instances the underlying causal variants have been fine-mapped, providing key biological insights. Most studies have been conducted in populations of European ancestry, but many studies now include multi-ethnic samples. Analysis of multi-ethnic data presents many advantages, including increased power to detect associated variants that are rare or absent in Europeans and increased resolution for fine-mapping, but it also poses many challenges. The extent to which genetic architectures are shared across ethnicities is not well understood, the implications for meta-analyzing studies across ethnicities are uncertain, and the optimal strategy for performing fine-mapping in multi-ethnic data remains an open question, particularly when allowing for multiple causal variants at a locus. These challenges can inhibit multi-ethnic study designs, limiting opportunities to detect new associations and address health disparities in minority populations. Here, we propose to develop a complete set of methods and software for disease mapping in multi-ethnic populations, building on the extensive progress of our research program over the past four years. Our goal is to make fully powered association and fine-mapping studies as practical in multi-ethnic populations as in studies of a single continental population. Our methods research will be driven by empirical data from >900,000 samples (>700,000 with raw genotypes/phenotypes and >200,000 with summary statistics), including African American, Latino, East Asian and South Asian samples spanning a wide range of quantitative and disease phenotypes. We will develop methods for both raw genotype/phenotype data and summary association statistic data, and the methods will be applicable to both common and rare variation, including gene-based tests.

Public Health Relevance

Genome-wide association studies (GWAS), an approach in which the genomes of both diseased and healthy individuals are scanned to identify genes affecting disease risk, have thus far been conducted mostly in populations of European ancestry, but many studies now include samples from multiple ethnicities. The inclusion of multiple ethnicities offers great promise for identifying genes that could not be detected by analyzing data only from Europeans, but existing statistical methods for analyzing multi-ethnic data are inadequate due to the complexities posed by combining data across ethnicities. In this proposal, we will use empirical genetic data sets to develop statistical methods and software to fill this gap.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Hindorff, Lucia
Harvard University
Public Health & Prev Medicine
Schools of Public Health
United States