Aberrant RNA modifications, especially methylations and pseudouridylations, have been correlated with major diseases such as breast cancer, type 2 diabetes, and obesity, each of which affects millions of Americans. Despite their significance, the tools available to reliably identify, locate, and quantify RNA modifications are very limited. As a result, the function of only a few modifications is known, even though more than 100 RNA modifications have been identified. Mass spectrometry (MS) is an essential tool for studying protein modifications, where peptide fragmentation produces "ladders" that reveal the identity and position of modifications. However, a similar approach is not yet feasible for RNA because in situ fragmentation techniques that provide satisfactory sequence coverage do not exist. One way to circumvent this issue is to perform chemical degradation beforehand so that well-defined mass ladders are formed before the sample enters the spectrometer. However, the structural uniformity of the ladder sequences generated by this prerequisite RNA degradation is unsatisfactory, which complicates downstream data analysis. We have spearheaded the development of a two-dimensional LC/MS-based de novo RNA sequencing tool that takes advantage of predictable regularities in the LC separation of optimized RNA digests to greatly simplify the interpretation of complex MS data. This method can simultaneously sequence up to three distinct RNAs of up to 30 nucleotides each, and can identify, locate, and quantify a broad spectrum of modifications in the RNA sample. We hypothesize that this MS-based RNA sequencing method can be further optimized into a robust, easy-to-use, and broadly applicable de novo sequencing approach, and that such a platform would be a highly useful and innovative tool that complements existing next-generation RNA sequencing protocols for in-depth functional study of the chemical modifications carried by endogenous RNAs.
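The mass-ladder principle described above can be illustrated with a minimal sketch: when consecutive ladder fragments differ by exactly one nucleotide, the mass difference between neighbors identifies that nucleotide, and a methyl group appears as an extra shift of about 14.016 Da. This is a simplified illustration and not the sequencing algorithm proposed in this application. The function names, the 0.01 Da tolerance, and the 500.0 Da starting mass are hypothetical, and the sketch assumes an idealized, complete, noise-free ladder.

```python
from typing import Dict, List

# Monoisotopic masses (Da) of nucleotide residues within an RNA chain
# (nucleoside monophosphate minus water).
RESIDUE_MASS: Dict[str, float] = {
    "A": 329.0525,
    "C": 305.0413,
    "G": 345.0474,
    "U": 306.0253,
}
METHYL_SHIFT = 14.0157  # mass added by one methyl group (CH2)


def build_table() -> Dict[str, float]:
    """Reference table of unmodified and singly methylated residues."""
    table = dict(RESIDUE_MASS)
    for base, mass in RESIDUE_MASS.items():
        table["m" + base] = mass + METHYL_SHIFT
    return table


def read_ladder(masses: List[float], tol: float = 0.01) -> List[str]:
    """Call one residue per gap between neighboring ladder masses.

    Gaps that match no table entry within `tol` are reported as "?".
    Pseudouridine has the same mass as uridine, so a mass difference
    alone cannot tell the two apart in this sketch.
    """
    table = build_table()
    ordered = sorted(masses)
    calls = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        label, ref = min(table.items(), key=lambda kv: abs(kv[1] - delta))
        calls.append(label if abs(ref - delta) <= tol else "?")
    return calls


def make_ladder(residues: List[str], start_mass: float = 500.0) -> List[float]:
    """Build an idealized ladder (hypothetical helper for the demo)."""
    table = build_table()
    masses = [start_mass]
    for residue in residues:
        masses.append(masses[-1] + table[residue])
    return masses


if __name__ == "__main__":
    ladder = make_ladder(["G", "A", "mC", "U"])
    print(read_ladder(ladder))
```

Real LC/MS data contain incomplete ladders, co-eluting species, and measurement noise, which is why the retention-time regularities mentioned above are needed to group fragments before any such mass-difference reading is attempted.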
In this application, we propose to (a) reduce the RNA loading amount to the minimum threshold at which de novo sequencing of endogenous RNAs becomes practicable (Aim 1), (b) develop a streamlined data analysis and sequence generation algorithm that will enhance the robustness of our sequencing method (Aim 2), and (c) provide proof-of-concept examples of the method's use in de novo sequencing of endogenous RNA samples (Aim 3). The proposed work is significant because it will bring the power of MS-based laddering technology to RNA, providing a method, comparable to the analysis of peptide modifications in proteomics, that can reveal the identity and position of various RNA modifications. This project is highly innovative because successful accomplishment of the proposed work will 1) allow the MS-based platform to routinely sequence cellular RNA automatically and in a de novo fashion, 2) broaden its utility across a wide range of applications, from research to the biotech industry, and 3) eliminate the need for complementary DNA strand synthesis and permit the establishment of a complete, unambiguous spatiotemporal and quantitative profile of a wide variety of structural modifications in RNA samples.
Although more than 100 RNA modifications have been identified so far, the function of only a few is known, mainly because of technological limitations. The goal of the proposed research is to develop a robust and broadly applicable RNA sequencing tool that allows accurate determination of the presence, type, and location of RNA modifications in samples of interest, as a critical step toward elucidating their possible impact on human health.