Imputation and Analysis of Rare Variants in Admixed Populations

Li, Yun

Abstract

Project Description: Genomewide association studies (GWAS) have identified >4000 genetic loci for a wide range of human traits, but a large proportion of heritability remains unexplained. In the post-GWAS era, geneticists are exploiting massively parallel sequencing technologies to study less common (minor allele frequency [MAF] 0.5-5%) and rare (MAF<0.5%) variants, hereafter together referred to as rare variants for brevity. Meanwhile, multiethnic GWAS, recognized as potentially more powerful for gene discovery and fine mapping, are receiving increasing attention from the genetics community. Among multiethnic populations, admixed populations such as African Americans and Hispanic Americans are particularly attractive because they comprise more than 20% of the US population. These admixed populations offer a unique opportunity for gene mapping because one can utilize admixture linkage disequilibrium (LD) to search for genes underlying diseases whose prevalence differs strikingly across populations. However, little methodological work exists for admixed populations that can accommodate post-GWAS data. The methodological work lags in at least three major areas. First, there are few, if any, genotype imputation methods that are tailored to admixed samples and can accommodate both the ever-increasing public resources and the typical mixture of genotyping and sequencing data among the study samples. Imputation will continue to play an essential role because sequencing will remain cost-prohibitive for large GWAS sample collections. Second, there has been no published work on practical issues regarding rare variant imputation in admixed populations. Third, despite the recent rich literature on statistical methods for rare variant association analysis in relatively homogeneous populations, the field needs methods that can efficiently analyze rare variants in admixed samples, particularly with imputed or partially imputed data.
In this application, we propose the following aims to fill the above gaps: 1) Develop efficient hidden Markov model and singular value decomposition based methods for haplotype-to-haplotype imputation in admixed populations; 2) Assess the quality of, and provide practical guidelines on, rare variant imputation in admixed populations; 3) Develop a robust statistical test for the analysis of rare variants in admixed populations; and 4) Develop, distribute, and support freely available software packages for the methods developed in this project.

Public Health Relevance

Genomewide association studies (GWAS) have identified >4000 genetic loci for a wide range of human traits, but a large proportion of heritability remains unexplained. In the post-GWAS era, geneticists are exploiting massively parallel sequencing technologies to study less common (minor allele frequency [MAF] 0.5-5%) and rare (MAF<0.5%) variants, hereafter together referred to as rare variants for brevity. Meanwhile, multiethnic GWAS, recognized as potentially more powerful for gene discovery and fine mapping, are receiving increasing attention from the genetics community. Among multiethnic populations, admixed populations such as African Americans and Hispanic Americans are particularly attractive because they comprise more than 20% of the US population. These admixed populations offer a unique opportunity for gene mapping because one can utilize admixture linkage disequilibrium (LD) to search for genes underlying diseases whose prevalence differs strikingly across populations. However, little methodological work exists for admixed populations that can accommodate post-GWAS data. In this application, we will fill methodological and practical gaps in the genetic analysis of rare variants in admixed populations.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG006703-03
Application #: 8634810
Study Section: Special Emphasis Panel (ZRG1-GGG-C (02))
Program Officer: Brooks, Lisa

Project Start: 2012-05-16
Project End: 2015-02-28
Budget Start: 2014-03-01
Budget End: 2015-02-28
Support Year: 3
Fiscal Year: 2014
Total Cost: $308,717
Indirect Cost: $85,855

Institution

Name: University of North Carolina Chapel Hill
Department: Genetics
Type: Schools of Medicine
DUNS #: 608195277

City: Chapel Hill
State: NC
Country: United States
Zip Code: 27599

Related projects


NIH 2014 R01 HG	Imputation and Analysis of Rare Variants in Admixed Populations Li, Yun / University of North Carolina Chapel Hill	$308,717
NIH 2013 R01 HG	Imputation and Analysis of Rare Variants in Admixed Populations Li, Yun / University of North Carolina Chapel Hill	$302,119
NIH 2012 R01 HG	Imputation and Analysis of Rare Variants in Admixed Populations Li, Yun / University of North Carolina Chapel Hill	$320,000

Publications

Duan, Qing; Xu, Zheng; Raffield, Laura M et al. (2018) A robust and powerful two-step testing procedure for local ancestry adjusted allelic association analysis in admixed populations. Genet Epidemiol 42:288-302

Luo, Yiwen; Maity, Arnab; Wu, Michael C et al. (2018) On the substructure controls in rare variant analysis: Principal components or variance components? Genet Epidemiol 42:276-287

Ju, Chelsea J-T; Zhao, Zhuangtian; Wang, Wei (2017) Efficient Approach to Correct Read Alignment for Pseudogene Abundance Estimates. IEEE/ACM Trans Comput Biol Bioinform 14:522-533

Hui, Daniel; Fang, Zhou; Lin, Jerome et al. (2017) LAIT: a local ancestry inference toolkit. BMC Genet 18:83

Raffield, Laura M; Zakai, Neil A; Duan, Qing et al. (2017) D-Dimer in African Americans: Whole Genome Sequence Analysis and Relationship to Cardiovascular Disease Risk in the Jackson Heart Study. Arterioscler Thromb Vasc Biol 37:2220-2227

Martin, Joshua S; Xu, Zheng; Reiner, Alex P et al. (2017) HUGIn: Hi-C Unifying Genomic Interrogator. Bioinformatics 33:3793-3795

Cannon, Maren E; Duan, Qing; Wu, Ying et al. (2017) Trans-ancestry Fine Mapping and Molecular Assays Identify Regulatory Variants at the ANGPTL8 HDL-C GWAS Locus. G3 (Bethesda) 7:3217-3227

Xu, Zheng; Zhang, Guosheng; Jin, Fulai et al. (2016) A hidden Markov random field-based Bayesian method for the detection of long-range chromosomal interactions in Hi-C data. Bioinformatics 32:650-656

Xu, Zheng; Zhang, Guosheng; Duan, Qing et al. (2016) HiView: an integrative genome browser to leverage Hi-C results for the interpretation of GWAS variants. BMC Res Notes 9:159

Naik, Rakhi P; Wilson, James G; Ekunwe, Lynette et al. (2016) Elevated D-dimer levels in African Americans with sickle cell trait. Blood 127:2261-2263

Showing the most recent 10 out of 42 publications
