Bayesian Methods for Epistasis Association Mapping

Zhang, Yu

Abstract

Genome-wide case-control association studies hold great promise for identifying disease-related genes and unveiling their underlying complex regulatory mechanisms. For common human diseases, the disease variants are often non-Mendelian: they have low penetrance and show little effect on the carrier's disease susceptibility when assessed individually, but they may interact with one another in complex ways. Identifying multi-locus interaction (epistasis) associations within the human genome is, however, computationally and statistically very challenging. Recent developments in statistical methods, such as stepwise logistic regression (Marchini et al. 2005) and the BEAM algorithm (Zhang and Liu, 2007), have demonstrated that genome-wide epistasis association mapping is not only feasible but can also be more fruitful than traditional approaches that focus exclusively on marginal effects. In this project, we propose to further improve the BEAM algorithm to explore the linkage disequilibrium (LD) structures and haplotypes inherent in the human genome, greatly advancing our ability to detect subtle disease associations and interactions. Various haplotype-based association methods have been developed over the past decades, yet there is no consensus on the best approach. We will develop a flexible Bayesian framework for testing both marginal and interaction associations using haplotypes. In particular, all possible haplotype combinations and their interactions will be explored efficiently via Markov chain Monte Carlo (MCMC) algorithms. We will also treat markers that are not genotyped in an association study as missing data. By iteratively imputing the missing markers and testing their associations, we will be able to identify a small set of disease-associated markers (which may include unobserved ones) that can explain the observed genetic differences between patients and controls.
In addition, unmeasured population structure in a case-control sample will induce long-range correlations between SNPs that may be falsely reported as interactions. There is an urgent need to improve the efficiency and accuracy of existing stratification detection algorithms. We propose to develop efficient Bayesian methods to identify population structure present in the case-control sample, together with novel statistical models to adjust for the detected population effects. The software will be written in C++ for both Unix/Linux and Windows systems and will be freely available to the community.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG004718-02
Application #: 7674011
Study Section: Special Emphasis Panel (ZRG1-GGG-A (52))
Program Officer: Brooks, Lisa

Project Start: 2008-08-15
Project End: 2011-06-30
Budget Start: 2009-07-01
Budget End: 2010-06-30
Support Year: 2
Fiscal Year: 2009
Total Cost: $141,721
Indirect Cost

Institution

Name: Pennsylvania State University
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 003403953

City: University Park
State: PA
Country: United States
Zip Code: 16802

Related projects


NIH 2014 R01 HG	New Bayesian algorithms for genome-wide association mapping	Zhang, Yu / Pennsylvania State University	$183,472
NIH 2013 R01 HG	New Bayesian algorithms for genome-wide association mapping	Zhang, Yu / Pennsylvania State University	$178,195
NIH 2012 R01 HG	New Bayesian algorithms for genome-wide association mapping	Zhang, Yu / Pennsylvania State University	$178,651
NIH 2010 R01 HG	Bayesian Methods for Epistasis Association Mapping	Zhang, Yu / Pennsylvania State University	$140,118
NIH 2009 R01 HG	Bayesian Methods for Epistasis Association Mapping	Zhang, Yu / Pennsylvania State University	$141,721
NIH 2008 R01 HG	Bayesian Methods for Epistasis Association Mapping	Zhang, Yu / Pennsylvania State University	$141,468

Publications

Zhang, Yu; Tian, Lifeng; Sleiman, Patrick et al. (2018) Bayesian analysis of genome-wide inflammatory bowel disease data sets reveals new risk loci. Eur J Hum Genet 26:265-274

Zhang, Yu; An, Lin; Yue, Feng et al. (2016) Jointly characterizing epigenetic dynamics across multiple human cell types. Nucleic Acids Res 44:6721-31

Lee, Yeonok; Ghosh, Debashis; Zhang, Yu (2014) Regression hidden Markov modeling reveals heterogeneous gene expression regulation: a case study in mouse embryonic stem cells. BMC Genomics 15:360

Lee, Yeonok; Ghosh, Debashis; Hardison, Ross C et al. (2014) MRHMMs: multivariate regression hidden Markov models and the variantS. Bioinformatics 30:1755-6

Chen, Kuan-Bei; Hardison, Ross; Zhang, Yu (2014) dCaP: detecting differential binding events in multiple conditions and proteins. BMC Genomics 15 Suppl 9:S12

Zhang, Yu; Ghosh, Soumitra; Hakonarson, Hakon (2014) Dynamic Bayesian testing of sets of variants in complex diseases. Genetics 198:867-78

Zhang, Yu (2013) De novo inference of stratification and local admixture in sequencing studies. BMC Bioinformatics 14 Suppl 5:S17

Lee, Yeonok; Ghosh, Debashis; Zhang, Yu (2013) Association testing to detect gene-gene interactions on sex chromosomes in trio data. Front Genet 4:239

Zhang, Yu (2013) A dynamic Bayesian Markov model for phasing and characterizing haplotypes in next-generation sequencing. Bioinformatics 29:878-85

Xu, Jialin; Zhang, Yu (2012) A generalized linear model for peak calling in ChIP-Seq data. J Comput Biol 19:826-38

Showing the most recent 10 out of 18 publications
