Statistical Methods to Map Genes for Complex Traits

Zhao, Hongyu

Abstract

Hundreds of genetic regions have been implicated in complex human traits in the past several years through the genome-wide association study (GWAS) paradigm. Despite these successes, statistical analyses in most published work were based on single genetic markers. In addition, prior biological knowledge about genetic markers is rarely used. From both statistical and biological points of view, the rich information in the collected GWAS data has not been fully utilized to reveal disease etiologies. To address these critical needs, many research groups have been actively developing statistical and computational methods that can jointly analyze multiple markers, both within a region and across regions, and methods that can more effectively incorporate other sources of information on genetic markers, genes, and pathways in association analysis. The long-term goals of this application are to develop and implement novel statistical methods to identify genes affecting an individual's susceptibility to complex traits, to apply these methods to ongoing studies to enable more biological findings, and to disseminate these tools to the general research community.
To achieve these broad goals, we propose to accomplish the following specific aims: (1) to develop statistical methods to identify markers that are informative about an individual's ancestry, and to take advantage of this information for more effective adjustment of sample heterogeneity in genetic association studies; (2) to develop statistical methods that can more efficiently perform multi-marker analysis, and to evaluate the statistical power of different marker search strategies; (3) to develop statistical methods that can systematically integrate different sources of information, especially biological pathways and networks, to increase our power to identify markers truly associated with complex diseases; (4) to develop statistical methods to use resequencing data to identify genetic associations between phenotypes and candidate regions. In addition, we will collaborate with leading human geneticists to apply the statistical methods to a wide array of diseases, to refine them, and to disseminate well-tested and validated programs to the scientific community.

Public Health Relevance

It is well known that genetics plays a major role in many complex human diseases, e.g., cancer, hypertension, and mental disorders. However, very few genes had been firmly implicated in these disorders until a few years ago. With the introduction of high-density platforms where hundreds of thousands of genetic variants can be monitored simultaneously and the formation of large collaborative projects where thousands of patients are jointly analyzed, the field of human genetics has recently undergone a revolution. Hundreds of genomic regions have been found to affect the risks of dozens of diseases, and this list will likely keep growing in the foreseeable future. These rich data have generated many statistical challenges, especially with the rapid development of resequencing technologies. This project will develop novel and powerful statistical methods to enable human geneticists to make the most of the valuable data collected. Through extensive collaborations, our methods will be applied to many ongoing studies to identify more genomic regions and biological pathways for complex diseases. We will also distribute the well-tested computer programs so that other researchers can use the statistical tools we develop.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM059507-14
Application #: 8434142
Study Section: Special Emphasis Panel (ZRG1-GGG-F (02))
Program Officer: Krasnewich, Donna M

Project Start: 1999-02-01
Project End: 2015-02-28
Budget Start: 2013-03-01
Budget End: 2015-02-28
Support Year: 14
Fiscal Year: 2013
Total Cost: $335,985
Indirect Cost: $132,973

Institution

Name: Yale University
Department: Public Health & Prev Medicine
Type: Schools of Medicine
DUNS #: 043207562

City: New Haven
State: CT
Country: United States
Zip Code: 06520

Publications

Sun, Jiehuan; Herazo-Maya, Jose D; Huang, Xiu et al. (2018) Distance-correlation based gene set analysis in longitudinal studies. Stat Appl Genet Mol Biol 17:

Wang, Tao; Zhao, Hongyu (2017) A Dirichlet-tree multinomial regression model for associating dietary nutrients with gut microorganisms. Biometrics 73:792-801

Liu, Yiyi; Zhao, Hongyu (2017) Variable importance-weighted Random Forests. Quant Biol 5:338-351

Sun, Jiehuan; Herazo-Maya, Jose D; Kaminski, Naftali et al. (2017) A Dirichlet process mixture model for clustering longitudinal gene expression data. Stat Med 36:3495-3506

Yan, Xiting; Liang, Anqi; Gomez, Jose et al. (2017) A novel pathway-based distance score enhances assessment of disease heterogeneity in gene expression. BMC Bioinformatics 18:309

Chung, Dongjun; Kim, Hang J; Zhao, Hongyu (2017) graph-GPA: A graphical model for prioritizing GWAS results and investigating pleiotropic architecture. PLoS Comput Biol 13:e1005388

Hou, Lin; Sun, Ning; Mane, Shrikant et al. (2017) Impact of genotyping errors on statistical power of association tests in genomic analyses: A case study. Genet Epidemiol 41:152-162

Lin, Zhixiang; Wang, Tao; Yang, Can et al. (2017) On joint estimation of Gaussian graphical models for spatial and temporal data. Biometrics 73:769-779

Sun, Jiehuan; Warren, Joshua L; Zhao, Hongyu (2017) A Bayesian semiparametric factor analysis model for subtype identification. Stat Appl Genet Mol Biol 16:145-158

Zhu, Ruoqing; Zhao, Ying-Qi; Chen, Guanhua et al. (2017) Greedy outcome weighted tree learning of optimal personalized treatment rules. Biometrics 73:391-400

Showing the most recent 10 out of 190 publications
