Statistical Methods for Single-Cell RNA-Seq

Wu, Hao

Abstract

Single-cell RNA sequencing (scRNA-seq) has recently emerged as a powerful technology for investigating transcriptomic variation and regulation at the level of individual cells. Traditional bulk RNA-seq pools RNA from a large number of cells and measures the average expression in a sample. In contrast, scRNA-seq reveals cell-to-cell heterogeneity, providing information critical to understanding biological processes in development, differentiation, and disease etiology. This new technology is expanding applications in both basic and clinical research, but its unique data characteristics also bring analytical challenges. These include: 1) difficulty in estimating molecule counts in the presence of technical artifacts, which arise from the small amount of starting material and the additional sample preparation procedures; 2) a lack of appropriate methods for functional clustering of single-cell RNA count data, which are much sparser than bulk RNA-seq data; 3) a lack of quantitative measures for assessing and comparing heterogeneity. We propose to address these challenges by developing a series of novel statistical methods for scRNA-seq data preprocessing and analysis. These include removing technical bias in RNA capture and amplification to obtain accurate molecule-level counts, identifying functional types and subtypes of cells and interpretable feature groups, explaining heterogeneity between samples and cells, and identifying differential heterogeneity. All methods developed in this project will be implemented and released as free, open-source software to benefit the genomics research community. The probability model and statistical framework established in this proposal will lay a foundation for future methodology development for other single-cell sequencing experiments, such as single-cell ATAC-seq or BS-seq.

Public Health Relevance

The regulation of gene expression plays a vital role in human health. Single-cell RNA sequencing (scRNA-seq) is a new technology for characterizing expression variation at the individual-cell level. It offers a promising route toward a better understanding of disease etiology, new drug targets, and strategies for personalized treatment. This project will produce novel statistical methods for scRNA-seq data preprocessing and analysis, enabling more efficient and accurate analysis of scRNA-seq data.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM122083-03
Application #: 9532910
Study Section: Special Emphasis Panel (ZGM1)
Program Officer: Brazhnik, Paul

Project Start: 2016-08-15
Project End: 2021-07-31
Budget Start: 2018-08-01
Budget End: 2019-07-31
Support Year: 3
Fiscal Year: 2018

Institution

Name: Emory University
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 066469933

City: Atlanta
State: GA
Country: United States
Zip Code: 30322

Related projects


NIH 2020 R01 GM	Statistical Methods for Single-Cell RNA-Seq Wu, Hao / Emory University
NIH 2019 R01 GM	Statistical Methods for Single-Cell RNA-Seq Wu, Hao / Emory University
NIH 2018 R01 GM	Statistical Methods for Single-Cell RNA-Seq Wu, Hao / Emory University
NIH 2017 R01 GM	Statistical Methods for Single-Cell RNA-Seq Wu, Hao / Emory University
NIH 2016 R01 GM	Statistical Methods for Single-Cell RNA-Seq Wu, Hao / Emory University	$357,190

Publications

Yao, Bing; Li, Yujing; Wang, Zhiqin et al. (2018) Active N6-Methyladenine Demethylation by DMAD Regulates Gene Expression by Coordinating with Polycomb Protein in Neurons. Mol Cell 71:848-857.e6

Cheng, Ying; Li, Ziyi; Manupipatpong, Sasicha et al. (2018) 5-Hydroxymethylcytosine alterations in the human postmortem brains of autism spectrum disorder. Hum Mol Genet 27:2955-2964

Xu, Tianlei; Zheng, Xiaoqi; Li, Ben et al. (2018) A comprehensive review of computational prediction of genome-wide features. Brief Bioinform

Feng, Hao; Jin, Peng; Wu, Hao (2018) Disease prediction by cell-free DNA methylation. Brief Bioinform

Wu, Zhijin; Zhang, Yi; Stitzel, Michael L et al. (2018) Two-phase differential expression analysis for single cell RNA-seq. Bioinformatics 34:3340-3348

Zhang, Feiran; Kang, Yunhee; Wang, Mengli et al. (2018) Fragile X mental retardation protein modulates the stability of its m6A-marked messenger RNA targets. Hum Mol Genet 27:3936-3950

Zhang, Weiwei; Feng, Hao; Wu, Hao et al. (2017) Accounting for tumor purity improves cancer subtype classification from DNA methylation data. Bioinformatics 33:2651-2657

Zheng, Xiaoqi; Zhang, Naiqian; Wu, Hua-Jun et al. (2017) Estimating and accounting for tumor purity in the analysis of DNA methylation data from cancer studies. Genome Biol 18:17

Wang, Yikai; Wu, Hao; Yu, Tianwei (2017) Differential gene network analysis from single cell RNA-seq. J Genet Genomics 44:331-334

Hong, Chuan; Ning, Yang; Wang, Shuang et al. (2017) PLEMT: A novel pseudolikelihood based EM test for homogeneity in generalized exponential tilt mixture models. J Am Stat Assoc 112:1393-1404

Showing the most recent 10 out of 11 publications
