About 2% of the human genome encodes proteins, and the vast majority of cellular RNA is non-coding (ncRNA). Mounting evidence indicates that ncRNA can fold into complex structures and play critical roles in cellular physiology and disease. After depletion of abundant ncRNA such as transfer RNA and ribosomal RNA, many cellular ncRNAs were found to be 3′-modified with a polyA tail (pA ncRNA), similar to messenger RNA. However, a significant portion of cellular ncRNA was found to lack the polyA tail (non-pA ncRNA) or is considered bimorphic (existing in both pA and non-pA forms). The sequence, structure, and biological function of non-pA and bimorphic ncRNA remain largely unknown, and technical limitations are the main roadblocks preventing scientists from exploring this largely uncharted fraction of the human transcriptome.
In this proposal, Dr. Shaw aims to develop a novel nanopore sequencing technique that enables direct sequencing and secondary structure detection of full-length non-pA ncRNA. This technique will use novel bacterial and eukaryotic reverse transcriptases (RTs) to capture native non-pA ncRNA and discretely thread the captured RNA through a Mycobacterium smegmatis porin A (MspA) nanopore sequencer for concurrent detection of RNA sequence and RNA secondary structure. The development and application of this technique will extend the current RNA sequencing toolbox and bring scientists one step closer to fully understanding the function of the human transcriptome.
The specific aims of this proposal are:
(Aim 1) Dr. Shaw will establish the experimental framework necessary for the single-molecule characterization of RTs and nanopore sequencing of RNA, along with the computational tools needed to accurately convert nanopore ion current into RNA sequence.
(Aim 2) Dr. Shaw will screen bacterial and eukaryotic RTs in search of the most robust RT to ratchet RNA through the nanopore with minimal ratcheting defects and optimal sensitivity to RNA secondary structures.
(Aim 3) In his independent research phase, Dr. Shaw will first validate the robustness of his novel technique by sequencing RNA mixtures with well-defined sequences and secondary structures. He will then apply his technique to profile the sequence and structure of non-pA ncRNA extracted from the HeLa-S3 cell line. Finally, he will further adapt his technique to be compatible with chemical methods that can directly probe native RNA secondary structures in vivo, such as dimethyl sulfate sequencing.
During the K99 career development stage, Dr. Shaw will conduct research with the mentorship and support of Dr. Carlos Bustamante (single molecule studies of molecular motors), Dr. Susan Marqusee (single molecule biophysics), Dr. Kathleen Collins (reverse transcriptase engineering), and Dr. Jens Gundlach (nanopore sequencing). This multidisciplinary group of experienced advisors and the outstanding scientific milieu of UC Berkeley will provide Dr. Shaw with the comprehensive training needed to achieve all aims of the proposal and to establish his independent research career as a principal investigator.

Public Health Relevance

A significant fraction of human cellular RNA plays critical roles in cellular machinery beyond serving as a direct messenger for protein production. Despite evidence suggesting the biological significance of these “non-coding” RNAs in cells, our understanding of their biogenesis, sequence diversity, structure, and function is limited. The technique developed in this proposed project will allow us to directly profile the sequence and structure of non-coding RNA, and will open up new avenues of genome research in which concurrent profiling of RNA sequence, structure, and function can be achieved.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Career Transition Award (K99)
Study Section: National Human Genome Research Institute Initial Review Group (GNOM)
Program Officer: Smith, Michael
University of California Berkeley
Schools of Arts and Sciences
United States