Analysis of Genomic Data for Complex Traits

Zhang, Heping

Abstract

Many health conditions, including substance use and mental illnesses, are complex and depend on both genetic and environmental factors. In the past several years, genome-wide association (GWA) studies have identified single-nucleotide polymorphisms implicating hundreds of robustly replicated loci for common traits. Despite numerous successes, it remains persistently difficult to identify genes, environmental factors, and interactions among them for complex diseases. This has been referred to as the geneticist's nightmare. Most of the identified variants have low associated risks and account for little heritability, and there is increasing attention to finding the "missing heritability" of complex diseases. To this end, it is important to develop novel statistical methods. Our Preliminary Progress demonstrates that our proposed methods have already produced significant findings on the association between genes, environments, and complex traits. Several genetic variants that we identified by our novel methods will be cataloged by the National Human Genome Research Institute. This project will take advantage of the PI's many years of experience in the data collection and analysis of GWA studies and build on his success in the development of statistical methods and software for genetic studies. The primary aim of this application is to continue our effort and success in developing, evaluating, and applying new statistical models, methods, and software to conduct GWA analyses of complex diseases.
Our specific aims are as follows: (A1) to develop statistical methods to perform inference for multidimensional and multi-modal traits. New methods will be developed to find the hidden heritability by incorporating multiple variants, simultaneously considering genetics and environment, and modeling multiple and heterogeneous traits; (A2) to develop tree- and forest-based methods for association analyses by incorporating multiple genetic variants, covariates, and gene-covariate interactions and incorporating existing biological information; (A3) to develop and release software for public use through the PI's website. As the methods and software are developed, they will be applied to a variety of real studies that will serve as motivation and validation of our methods and software. In this regard, our secondary aims are to (B1) identify genes and environmental factors for addiction, mental illnesses, and the co-morbidity of psychiatric disorders; and (B2) identify genetic variants and environmental factors for preterm deliveries. In short, the objective of this project is significant, the foundation of our approach has been tested, and the new development will be novel and useful. The PI has decades of experience related to this project and leads a research center with well-established infrastructure and supporting personnel and students.

Public Health Relevance

Despite great advances in technology and methodology that have led to recent successes in identifying genetic variants for complex diseases, the development of novel statistical methods is critically important in dealing with difficulties inherent in genetic studies of complex phenotypes. This project will have a significant impact on the analysis of genetic data and hence on public health, because our methods and software can help investigators understand genetic and environmental factors of common and complex diseases including substance use, cancer, and preterm birth.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute on Drug Abuse (NIDA)
Type: Research Project (R01)
Project #: 5R01DA016750-11
Application #: 8637949
Study Section: Cardiovascular and Sleep Epidemiology (CASE)
Program Officer: Pollock, Jonathan D

Project Start: 2003-07-01
Project End: 2017-03-31
Budget Start: 2014-04-01
Budget End: 2015-03-31
Support Year: 11
Fiscal Year: 2014
Total Cost: $291,188
Indirect Cost: $111,188

Institution

Name: Yale University
Department: Public Health & Prev Medicine
Type: Schools of Medicine
DUNS #: 043207562

City: New Haven
State: CT
Country: United States
Zip Code: 06520

Publications

Pan, Wenliang; Tian, Yuan; Wang, Xueqin et al. (2018) Ball divergence: Nonparametric two sample test. Ann Stat 46:1109-1137

You, Na; He, Shun; Wang, Xueqin et al. (2018) Subtype classification and heterogeneous prognosis model construction in precision medicine. Biometrics 74:814-822

Liu, Dungang; Zhang, Heping (2018) Residuals and Diagnostics for Ordinal Regression Models: A Surrogate Approach. J Am Stat Assoc 113:845-854

Guo, Xiaobo; Zhu, Junxian; Fan, Qiao et al. (2018) A univariate perspective of multivariate genome-wide association analysis. Genet Epidemiol 42:470-479

Wen, Canhong; Mehta, Chintan M; Tan, Haizhu et al. (2018) Whole genome association study of brain-wide imaging phenotypes: A study of the PING cohort. Genet Epidemiol 42:265-275

Mehta, Chintan M; Gruen, Jeffrey R; Zhang, Heping (2017) A method for integrating neuroimaging into genetic models of learning performance. Genet Epidemiol 41:4-17

Xiao, Feifei; Niu, Yue; Hao, Ning et al. (2017) modSaRa: a computationally efficient R package for CNV identification. Bioinformatics 33:2384-2385

Bi, Xuan; Yang, Liuqing; Li, Tengfei et al. (2017) Genome-wide mediation analysis of psychiatric and cognitive traits through imaging phenotypes. Hum Brain Mapp 38:4088-4097

Song, Chi; Min, Xiaoyi; Zhang, Heping (2016) The screening and ranking algorithm for change-points detection in multiple samples. Ann Appl Stat 10:2102-2129

Cao, Taoyun; Wang, Xueqin; Zhang, Heping (2016) Energy bagging tree. Stat Interface 9:171-181

Showing the most recent 10 out of 94 publications
