The role of mRNA modifications as a regulatory process that affects gene expression at multiple levels is not well studied or understood. Current experimental tools for detecting RNA modifications are laborious, noisy, and often do not provide the exact locations of modified bases. Sequencing with Oxford Nanopore technology offers several advantages over Illumina sequencing, including long reads and the ability to sequence RNA directly without amplification, which reduces coverage bias and makes it possible in principle to detect modified bases. That potential remains unfulfilled because tools for the task are lacking. This project seeks to make it significantly easier to identify RNA modifications globally and to help uncover the biological roles of the more than 150 different types of RNA modifications. The challenge in the proposed research is the relatively small number of known modified bases, which requires careful design of sufficiently large labeled datasets and deep learning training algorithms that can succeed despite limited data. The project draws upon recent developments in deep learning for tasks with few available labeled training examples to develop novel ways of applying deep learning architectures for base calling to the calling of modified RNA bases.

The proposed work will be transformative for research into RNA modifications and will enable the use of nanopore sequencing as a one-stop shop for this purpose. Furthermore, it could lead to improved methods for detecting the targets of RNA-binding proteins, as several novel methods for that task are based on detecting modified RNA bases. Oxford Nanopore does not release the code for its production base calling software as open source, which limits the ability of the research community to extend its methods to handle modifications. The tools designed as part of this work will provide a flexible open-source alternative, enabling progress on base calling of nanopore data.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Type: Standard Grant (Standard)
Application #: 1949036
Program Officer: Jean Gao
Project Start:
Project End:
Budget Start: 2020-04-01
Budget End: 2022-03-31
Support Year:
Fiscal Year: 2019
Total Cost: $300,000
Indirect Cost:

Name: Colorado State University-Fort Collins
Department:
Type:
DUNS #:
City: Fort Collins
State: CO
Country: United States
Zip Code: 80523