The objective of this research proposal is to develop an entirely new approach to the analysis and summary of genome association data. In contrast to approaches that rely on asymptotic parametric results or on computationally intensive resampling, our approach uses exact permutation moments followed by a density approximation to the distribution of the relevant statistics. The new approach will be far faster and provide more accurate p-values than current methods. We will develop these procedures into a new software package, PANGEA. PANGEA will be especially useful for next-generation sequencing (NGS) data, and more generally for the even larger datasets expected in future genomics applications. The proposal is divided into three Aims: (i) to develop powerful and accurate testing procedures for genetic association studies of SNPs/variants, applicable to both SNP array and NGS platforms, with flexible handling of families and effective covariate control; (ii) to develop fast and accurate empirical pathway analysis approaches for genetic association; (iii) to provide efficient and user-friendly software, further informed by comprehensive eQTL and ENCODE genomic annotation.
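The idea of replacing resampling with exact permutation moments plus a density approximation can be illustrated with a minimal sketch. It is not PANGEA code, and the abstract does not say which statistic or density family the project uses. The sketch assumes the simple association statistic T = sum of x[i] * y[i], whose mean and variance over all permutations of y have closed forms, and it assumes a normal density matched to those two moments. All function names are hypothetical.

```python
import math
from itertools import permutations


def permutation_moments(x, y):
    """Exact mean and variance of T = sum(x[i] * y[perm[i]]) over all permutations.

    Closed forms: E[T] = n * mean(x) * mean(y), Var[T] = Sxx * Syy / (n - 1),
    where Sxx and Syy are the centered sums of squares.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return n * mean_x * mean_y, sxx * syy / (n - 1)


def approx_p_value(x, y):
    """Two-sided p-value from a normal density matched to the exact moments.

    Assumption: a two-moment normal fit, chosen here only for simplicity.
    """
    observed = sum(xi * yi for xi, yi in zip(x, y))
    mean_t, var_t = permutation_moments(x, y)
    if var_t == 0:
        return 1.0  # statistic is constant under permutation
    z = (observed - mean_t) / math.sqrt(var_t)
    return math.erfc(abs(z) / math.sqrt(2))


def brute_force_moments(x, y):
    """Mean and variance of T by enumerating every permutation (small n only)."""
    values = [sum(xi * yp for xi, yp in zip(x, perm)) for perm in permutations(y)]
    mean_t = sum(values) / len(values)
    var_t = sum((t - mean_t) ** 2 for t in values) / len(values)
    return mean_t, var_t
```

The closed-form moments cost time linear in the sample size, whereas enumeration or resampling grows with the number of permutations drawn. That is the source of the speed advantage in this toy setting. A fit using higher moments or a different density family would follow the same pattern.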

Public Health Relevance

The proposed research will produce theory, methods, and software that use mathematical approximations to permutation testing. The work will greatly accelerate and enhance the conduct of genome association studies, and will improve public health by enabling faster and more accurate identification of the genetic underpinnings of complex disease.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Exploratory/Developmental Grants (R21)
Project #
1R21HG007840-01A1
Application #
8893207
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Brooks, Lisa
Project Start
2015-06-16
Project End
2017-04-30
Budget Start
2015-06-16
Budget End
2016-04-30
Support Year
1
Fiscal Year
2015
Total Cost
$227,250
Indirect Cost
$77,250
Name
North Carolina State University Raleigh
Department
Biostatistics & Other Math Sci
Type
Schools of Arts and Sciences
DUNS #
042092122
City
Raleigh
State
NC
Country
United States
Zip Code
27695
Palowitch, John; Shabalin, Andrey; Zhou, Yi-Hui et al. (2018) Estimation of cis-eQTL effect sizes using a log of linear model. Biometrics 74:616-625
Zhou, Yi-Hui; Marron, James S; Wright, Fred A (2018) Computation of ancestry scores with mixed families and unrelated individuals. Biometrics 74:155-164
Polineni, Deepika; Dang, Hong; Gallins, Paul J et al. (2018) Airway Mucosal Host Defense Is Key to Genomic Regulation of Cystic Fibrosis Lung Disease Severity. Am J Respir Crit Care Med 197:79-93
Zhou, Yi-Hui; Marron, J S; Wright, Fred A (2018) Eigenvalue significance testing for genetic association. Biometrics 74:439-447
Hu, Tao; Gallins, Paul; Zhou, Yi-Hui (2018) A Zero-inflated Beta-binomial Model for Microbiome Data Analysis. Stat (Int Stat Inst) 7:
Aoshima, Makoto; Shen, Dan; Shen, Haipeng et al. (2018) A survey of high dimension low sample size asymptotics. Aust N Z J Stat 60:4-19
Zhou, Yi-Hui; Brooks, Paul; Wang, Xiaoshan (2018) A two-stage hidden Markov model design for biomarker detection, with application to microbiome research. Stat Biosci 10:41-58
Rudra, Pratyaydipta; Zhou, Yihui; Wright, Fred A (2017) A procedure to detect general association based on concentration of ranks. Stat (Int Stat Inst) 6:88-101
Zhou, Yi-Hui (2016) Pathway analysis for RNA-Seq data using a score-based approach. Biometrics 72:165-74
Corvol, Harriet; Blackman, Scott M; Boëlle, Pierre-Yves et al. (2015) Genome-wide association meta-analysis identifies five modifier loci of lung disease severity in cystic fibrosis. Nat Commun 6:8382