This project aims to develop informatic tools for single-nucleotide analysis of cancer RNA sequencing (RNA-Seq) data. RNA-Seq is becoming an essential tool in both basic and clinical cancer research. As a result, numerous research groups are generating their own RNA-Seq data sets. In addition, large consortium efforts, such as the TCGA project, are producing an extraordinary amount of RNA-Seq data that are invaluable resources to the research community. The wide adoption of RNA-Seq calls for effective and user-friendly informatic tools that can extract biologically relevant information. To meet this need, great effort has been dedicated to tool development, resulting in products that are now in wide use, such as short read aligners, transcriptome assembly tools, and methods to detect differential gene expression and alternative splicing. However, although a major advantage of RNA-Seq is its capacity to provide information at the single-nucleotide level, tools that harness this information are relatively scarce. As a result, single-nucleotide analysis is not yet a routine procedure in RNA-Seq informatics. This type of analysis can potentially reveal important biological insights. With sequencing errors excluded, single-nucleotide variants (SNVs) expressed in the RNA reflect the existence of genetic variants or RNA editing sites, both of which could be essential players in cancer diagnostics, basic mechanisms and biology. A number of challenges exist in the identification, quantification and functional prediction of these SNVs. We have developed a suite of methodologies to address these challenges. We will further improve these methods and build user-friendly informatic tools and web portals to identify and analyze SNVs in cancer RNA-Seq data. These tools will facilitate a broad spectrum of SNV analysis, ranging from raw read mapping to functional prediction of how SNVs affect alternative splicing or RNA stability.
With no additional experimental cost, information on SNVs is readily extractable from all RNA-Seq data sets. A full exploration of this information could provide novel insights and maximize the scientific value of the still costly RNA-Seq data. Our project will develop tools that will enable incorporation of SNV analysis into routine procedures of RNA-Seq informatics.

Public Health Relevance

RNA sequencing (RNA-Seq) is becoming a widely used technology in basic and translational cancer research. The proposed projects aim to develop a suite of informatic tools to allow identification and functional prediction of single-nucleotide variants in RNA-Seq data. This work will provide tools that complement existing ones and greatly facilitate incorporation of single-nucleotide analyses into routine procedures of RNA-Seq informatics. Such analyses will provide a mechanistic basis for how single-nucleotide variants in the genome and transcriptome may contribute to disease, such that future interventions can target specific genes therapeutically.

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 5U01CA204695-02
Application #: 9260868
Study Section: Special Emphasis Panel (ZCA1)
Program Officer: Li, Jerry
Project Start: 2016-04-12
Project End: 2019-03-31
Budget Start: 2017-04-01
Budget End: 2018-03-31
Support Year: 2
Fiscal Year: 2017
Total Cost:
Indirect Cost:
Name: University of California Los Angeles
Department: Physiology
Type: Schools of Arts and Sciences
DUNS #: 092530369
City: Los Angeles
State: CA
Country: United States
Zip Code: 90095
Brümmer, Anneke; Yang, Yun; Chan, Tracey W et al. (2017) Structure-mediated modulation of mRNA abundance by A-to-I editing. Nat Commun 8:1255