Statistical Methods for RNA-seq Data Analysis

Sun, Wei

Abstract

Gene expression data produced from expression microarrays have not only greatly improved our understanding of cell biology, but also provided invaluable resources to guide the diagnosis and treatment of human diseases. However, the pace of incorporating gene expression signatures into medical practice has been relatively slow. This is mainly due to the limitations of gene expression microarrays and the natural variation of gene expression across tissues or developmental stages. This research project aims to overcome these limitations through the joint study of germline DNA polymorphisms and allele-specific expression (ASE) obtained from RNA-seq data. Since germline DNA polymorphisms are stable across tissues and developmental stages, inclusion of DNA information will help us establish more reliable biomarkers for patients' clinical care. More specifically, we will study the genetic basis of ASE in both normal and tumor tissues, dissect genetic and parent-of-origin effects on ASE in human cell lines, and identify genes that escape X inactivation in both mouse reciprocal crosses and human cell lines.

Public Health Relevance

We propose to develop statistical methods and software for RNA-seq data analysis, with the specific aims of dissecting the genetic basis of allele-specific expression (ASE), quantitatively assessing autosomal imprinting in humans, and measuring escape from X-inactivation in mouse and human under genetic control. The deliverables of this project will help biomedical researchers harvest the huge amount of knowledge accumulated in DNA variation and RNA-seq data and translate it into strategies for personalized disease prevention and treatment.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM105785-05
Application #: 9267991
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Krasnewich, Donna M

Project Start: 2014-05-15
Project End: 2019-04-30
Budget Start: 2017-05-01
Budget End: 2019-04-30
Support Year: 5
Fiscal Year: 2017

Institution

Name: Fred Hutchinson Cancer Research Center
DUNS #: 078200995

City: Seattle
State: WA
Country: United States
Zip Code: 98109

Related projects


NIH 2021 R01 GM	Statistical Methods for RNA-seq Data Analysis Sun, Wei / Fred Hutchinson Cancer Research Center
NIH 2020 R01 GM	Statistical Methods for RNA-seq Data Analysis Sun, Wei / Fred Hutchinson Cancer Research Center
NIH 2019 R01 GM	Statistical Methods for RNA-seq Data Analysis Sun, Wei / Fred Hutchinson Cancer Research Center
NIH 2017 R01 GM	Statistical Methods for RNA-seq Data Analysis Sun, Wei / Fred Hutchinson Cancer Research Center
NIH 2016 R01 GM	Statistical Methods for RNA-seq Data Analysis Sun, Wei / Fred Hutchinson Cancer Research Center
NIH 2015 R01 GM	Statistical Methods for RNA-seq Data Analysis Sun, Wei / University of North Carolina Chapel Hill	$365,881
NIH 2015 R01 GM	Statistical Methods for RNA-seq Data Analysis Sun, Wei / Fred Hutchinson Cancer Research Center	$254,443
NIH 2014 R01 GM	Statistical Methods for RNA-seq Data Analysis Sun, Wei / University of North Carolina Chapel Hill

Publications

He, Qianchuan; Liu, Yang; Sun, Wei (2018) Statistical analysis of non-coding RNA data. Cancer Lett 417:161-167

Liu, Yang; He, Qianchuan; Sun, Wei (2018) Association analysis using somatic mutations. PLoS Genet 14:e1007746

Kirk, Jessime M; Kim, Susan O; Inoue, Kaoru et al. (2018) Functional classification of long non-coding RNAs by k-mer content. Nat Genet 50:1474-1482

Liu, Yanyan; Xiong, Sican; Sun, Wei et al. (2018) Joint Analysis of Strain and Parent-of-Origin Effects for Recombinant Inbred Intercrosses Generated from Multiparent Populations with the Collaborative Cross as an Example. G3 (Bethesda) 8:599-605

Sun, Wei; Bunn, Paul; Jin, Chong et al. (2018) The association between copy number aberration, DNA methylation and gene expression in tumor samples. Nucleic Acids Res 46:3009-3018

Chen, Ting-Huei; Sun, Wei (2017) Prediction of cancer drug sensitivity using high-dimensional omic features. Biostatistics 18:1-14

Zhang, Yiwen; Zhou, Hua; Zhou, Jin et al. (2017) Regression Models For Multivariate Count Data. J Comput Graph Stat 26:1-13

Zhou, Hua; Blangero, John; Dyer, Thomas D et al. (2017) Fast Genome-Wide QTL Association Mapping on Pedigree and Population Data. Genet Epidemiol 41:174-186

Hu, Yi-Juan; Liao, Peizhou; Johnston, H Richard et al. (2016) Testing Rare-Variant Association without Calling Genotypes Allows for Systematic Differences in Sequencing between Cases and Controls. PLoS Genet 12:e1006040

Rashid, Naim U; Sun, Wei; Ibrahim, Joseph G (2016) A Statistical Model to Assess (Allele-Specific) Associations Between Gene Expression and Epigenetic Features Using Sequencing Data. Ann Appl Stat 10:2254-2273

Showing the most recent 10 out of 22 publications
