Unified Statistical Methods for Sequence-based Association Studies

Fast and economical next-generation sequencing (NGS) technologies will generate unprecedentedly massive (thousands of individuals) and high-dimensional (tens of millions) genomic and epigenomic variation data that allow nearly complete evaluation of genomic and epigenomic variation, including common and rare variants, RNA-seq, mRNA-seq and methylation-seq data. These data are so densely distributed across the genome that genetic variants can be treated as observations varying over a continuum. The emergence of NGS technologies is not only changing our view of genomics from an independently segregating discrete model to hybrid (both discrete and continuous) models, but also driving major changes in analytic methods for genomic and epigenomic analysis: from standard multivariate data analysis to functional data analysis, from independent sampling to dependent sampling, from low-dimensional to high-dimensional data analysis, and from single genomic or epigenomic variant analysis to integrated genomic and epigenomic analysis. To address these challenges in NGS data analysis, the goals of this proposal are to develop novel and powerful statistical methods for sequence-based association studies and QTL (eQTL) analysis that leverage high-dimensional data reduction, causal inference and functional data analysis techniques to identify both common and rare risk variants across the genome, investigate their function via intermediate phenotypes and expression, estimate the total effects (intervention effects) and direct effects of variants on the phenotypes, and unify family- and population-based designs. We will evaluate the performance of these methods using simulated and real datasets.

Public Health Relevance

This application will employ high-dimensional data reduction and functional data analysis techniques to develop and test innovative genetic models, statistical methods and computational algorithms for sequence-based association studies and QTL analysis, and to unify family- and population-based designs using various types of family and unrelated-individual data sampled from any population structure.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM104411-03
Application #
8788936
Study Section
Special Emphasis Panel (ZGM1-GDB-7 (CP))
Program Officer
Krasnewich, Donna M
Project Start
2013-04-01
Project End
2017-01-31
Budget Start
2015-02-01
Budget End
2016-01-31
Support Year
3
Fiscal Year
2015
Total Cost
$309,450
Indirect Cost
$81,667
Name
University of Texas Health Science Center Houston
Department
Biostatistics & Other Math Sci
Type
Schools of Public Health
DUNS #
800771594
City
Houston
State
TX
Country
United States
Zip Code
77225
Publications

Zhao, Jinying; Zhu, Yun; Xiong, Momiao (2016) Genome-wide gene-gene interaction analysis for next-generation sequencing. Eur J Hum Genet 24:421-8
Fan, Ruzong; Wang, Yifan; Boehnke, Michael et al. (2015) Gene Level Meta-Analysis of Quantitative Traits by Functional Linear Models. Genetics 200:1089-104
Jiang, Junhai; Lin, Nan; Guo, Shicheng et al. (2015) Multiple functional linear model for association analysis of RNA-seq with imaging. Quant Biol 3:90-102
Wang, Yifan; Liu, Aiyi; Mills, James L et al. (2015) Pleiotropy analysis of quantitative traits at gene level by multivariate functional linear models. Genet Epidemiol 39:259-75
Huang, Jinyan; Chen, Jun; Esparza, Jorge et al. (2015) eQTL mapping identifies insertion- and deletion-specific eQTLs in multiple tissues. Nat Commun 6:6821
Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric et al. (2015) Pathway analysis with next-generation sequencing data. Eur J Hum Genet 23:507-15
Dong, Chengliang; Wei, Peng; Jian, Xueqiu et al. (2015) Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies. Hum Mol Genet 24:2125-37
Guo, Shicheng; Yan, Fengyang; Xu, Jibin et al. (2015) Identification and validation of the methylation biomarkers of non-small cell lung cancer (NSCLC). Clin Epigenetics 7:3
Tang, Hongwei; Wei, Peng; Duell, Eric J et al. (2014) Genes-environment interactions in obesity- and diabetes-associated pancreatic cancer: a GWAS data analysis. Cancer Epidemiol Biomarkers Prev 23:98-106
Guo, Shicheng; Wang, Yu-Long; Li, Yi et al. (2014) Significant SNPs have limited prediction ability for thyroid cancer. Cancer Med 3:731-5

Showing the most recent 10 out of 31 publications