Biologists are rapidly adopting RNA-Sequencing (RNA-Seq) to study transcriptomes for basic understanding of cellular functions and to address important needs in such areas as food production, food security, pharmacy, human health, disease treatment, and disease prevention. Statistical tools for complete and specific analysis of RNA-Seq data, however, have been slow to emerge, and the use of off-the-shelf tools developed for other applications has the strong potential to produce misleading conclusions. Germane to the goals of this proposal, sophisticated methods for assessing differential gene expression from RNA-Seq, based on a negative binomial (NB) exact test for two-group comparisons, have not yet been extended to regression analysis. Such methods are required for assessing differential gene expression after accounting for covariates, for analyzing the dependence of expression on explanatory variables, and for studying interactive effects of multiple factors on expression. The objectives of this proposal are to address this need in the following ways: 1) develop, assess, and implement higher-order asymptotic (HOA) adjustments to likelihood ratio inference for NB regression analysis of RNA-Seq data, including the preparation of a publicly available R package for complete regression analysis of RNA-Seq data, and the inclusion of the inferential computations in an already publicly available, Perl-based computational pipeline for complete analysis of RNA-Seq data; 2) clarify the power of optimal inference for RNA-Seq studies and provide a computer program for assessing sample size needs; and 3) develop an interactive, dynamic visualization program for conveying RNA-Seq data, NB regression model results, and associated uncertainties.
The methods used include the application of higher-order asymptotic theory, Monte Carlo simulation, the development of Level of Detail (LOD) "focus plus context" visualization methods, and serious attention to real RNA-Seq datasets.
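
To make the likelihood ratio and Monte Carlo ideas above concrete, the following minimal Python sketch runs a first-order likelihood ratio test for a two-group NB comparison and estimates its power by simulation. It is an illustration only and not the proposal's method: it applies no HOA adjustment, it treats the dispersion `phi` as known, and it assumes equal library sizes. The function names (`nb_loglik`, `lr_test`, `sample_nb`, `estimate_power`) and all parameter values are hypothetical.

```python
import math
import random


def nb_loglik(counts, mu, phi):
    """NB log-likelihood with mean mu and dispersion phi (variance = mu + phi * mu**2)."""
    size = 1.0 / phi
    total = 0.0
    for y in counts:
        total += math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        total += size * math.log(size / (size + mu))
        if y > 0:  # the y * log(...) term vanishes when y == 0
            total += y * math.log(mu / (size + mu))
    return total


def lr_test(group_a, group_b, phi):
    """First-order likelihood ratio test of equal means, dispersion assumed known."""
    def mean(xs):
        return sum(xs) / len(xs)

    # With the dispersion fixed, the MLE of an NB mean is the sample mean.
    pooled = list(group_a) + list(group_b)
    ll_null = nb_loglik(pooled, mean(pooled), phi)
    ll_alt = (nb_loglik(group_a, mean(group_a), phi)
              + nb_loglik(group_b, mean(group_b), phi))
    stat = max(0.0, 2.0 * (ll_alt - ll_null))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def sample_nb(rng, mu, phi):
    """Draw one NB count as a gamma-Poisson mixture (Knuth's Poisson sampler)."""
    lam = rng.gammavariate(1.0 / phi, mu * phi)  # shape, scale: mean is mu
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def estimate_power(n_per_group, mu, fold_change, phi,
                   alpha=0.05, n_sim=500, seed=1):
    """Monte Carlo estimate of the rejection rate of lr_test at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        a = [sample_nb(rng, mu, phi) for _ in range(n_per_group)]
        b = [sample_nb(rng, mu * fold_change, phi) for _ in range(n_per_group)]
        _, p_value = lr_test(a, b, phi)
        if p_value < alpha:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    # Hypothetical settings: 5 replicates per group, baseline mean 50, dispersion 0.1.
    for fc in (1.0, 2.0, 4.0):
        print(fc, estimate_power(5, 50.0, fc, 0.1))
```

Setting `fold_change=1.0` estimates the actual type I error rate of the unadjusted test, which with few replicates can drift from the nominal level; this small-sample inaccuracy is the kind of problem that higher-order adjustments are meant to reduce. Repeating the run over a grid of `n_per_group` values gives a rough sample size assessment under these simplifying assumptions.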

Public Health Relevance

The tools developed from the work proposed herein have direct relevance to human health because RNA-Seq-based transcriptome profiling has broad applications in nearly all areas of biological inquiry, including human health, disease treatment, and disease prevention.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM104977-02
Application #
8501581
Study Section
Special Emphasis Panel (ZGM1-CBCB-5 (BM))
Program Officer
Bender, Michael T
Project Start
2012-07-01
Project End
2015-04-30
Budget Start
2013-05-01
Budget End
2014-04-30
Support Year
2
Fiscal Year
2013
Total Cost
$190,161
Indirect Cost
$48,637
Name
Oregon State University
Department
Biostatistics & Other Math Sci
Type
Schools of Arts and Sciences
DUNS #
053599908
City
Corvallis
State
OR
Country
United States
Zip Code
97339
Beaver, Laura M; Kuintzle, Rachael; Buchanan, Alex et al. (2017) Long noncoding RNAs and sulforaphane: a target for chemoprevention and suppression of prostate cancer. J Nutr Biochem 42:72-83
Quandt, C Alisha; Di, Yanming; Elser, Justin et al. (2016) Differential Expression of Genes Involved in Host Recognition, Attachment, and Degradation in the Mycoparasite Tolypocladium ophioglossoides. G3 (Bethesda) 6:731-41
Araújo, Welington L; Creason, Allison L; Mano, Emy T et al. (2016) Genome Sequencing and Transposon Mutagenesis of Burkholderia seminalis TC3.4.2R3 Identify Genes Contributing to Suppression of Orchid Necrosis Caused by B. gladioli. Mol Plant Microbe Interact 29:435-46
Mi, Gu; Di, Yanming; Schafer, Daniel W (2015) Goodness-of-fit tests and model diagnostics for negative binomial regression of RNA sequencing data. PLoS One 10:e0119254
Mi, Gu; Di, Yanming (2015) The level of residual dispersion variation and the power of differential expression tests for RNA-Seq data. PLoS One 10:e0120117
Di, Yanming (2015) Single-gene negative binomial regression models for RNA-Seq data with higher-order asymptotic inference. Stat Interface 8:405-418
Goyer, Aymeric; Hamlin, Launa; Crosslin, James M et al. (2015) RNA-Seq analysis of resistant and susceptible potato varieties during the early stages of potato virus Y infection. BMC Genomics 16:472
Burkhardt, Alyssa; Buchanan, Alex; Cumbie, Jason S et al. (2015) Alternative Splicing in the Obligate Biotrophic Oomycete Pathogen Pseudoperonospora cubensis. Mol Plant Microbe Interact 28:298-309
Chang, Jeff H; Desveaux, Darrell; Creason, Allison L (2014) The ABCs and 123s of bacterial secretion systems in plant pathogenesis. Annu Rev Phytopathol 52:317-45
Beaver, Laura M; Buchanan, Alex; Sokolowski, Elizabeth I et al. (2014) Transcriptome analysis reveals a dynamic and differential transcriptional response to sulforaphane in normal and prostate cancer cells and suggests a role for Sp1 in chemoprevention. Mol Nutr Food Res 58:2001-13

Showing the most recent 10 out of 15 publications