Most of the eukaryotic genome is transcribed, yielding a complex repertoire of transcripts that includes tens of thousands of individual noncoding RNAs with little or no predicted protein-coding capacity. Among these are well-studied small RNAs, such as microRNAs, as well as many other classes of small and long transcripts whose functions and mechanisms of biogenesis are less clear, but likely no less important. The MALAT1 locus is over-expressed in many human cancers and produces an abundant long nuclear-retained noncoding RNA. We previously showed that, although MALAT1 is an RNA polymerase II transcript, its 3' end is not produced by canonical cleavage/polyadenylation but instead by recognition and cleavage of a tRNA-like structure by RNase P. This cleavage also generates a second noncoding RNA from the MALAT1 locus, known as mascRNA, that is tRNA-like and exported to the cytoplasm. mascRNA is significantly more evolutionarily conserved than the long MALAT1 transcript; however, the function of mascRNA and its role in cancer initiation/progression have not been explored.
In Specific Aim 1, I will use a newly developed expression plasmid that recapitulates MALAT1 3' end processing to efficiently overexpress mascRNA in tissue culture cells. I will identify the changes in gene expression and cellular phenotype induced by modulating mascRNA expression, revealing paradigms for how tRNA-like small RNAs function in mammalian cells.
In Specific Aim 2, I will characterize the molecular mechanisms by which the 3' end of the long MALAT1 transcript is stabilized despite the absence of a canonical poly(A) tail. These experiments will reveal new insights into how long transcripts not subjected to cleavage/polyadenylation are made resistant to degradation and function in gene expression.
As there are very likely other noncoding RNAs besides MALAT1 that are processed at their 3' ends via non-canonical mechanisms, next-generation sequencing technology will be used in Specific Aim 3 to specifically identify the 3' ends of long poly(A) minus RNAs. Nearly all previous studies characterizing the transcriptome have used a poly(A) selection step to enrich for messenger RNAs and deplete abundant housekeeping RNAs, such as ribosomal RNAs. However, this step also removes all long RNAs that lack poly(A) tails and, therefore, most transcripts subjected to non-canonical 3' end processing mechanisms. By using a novel library construction method, the mature 3' ends of these previously "hidden" RNAs will be revealed and characterized, providing insights into unexpected regulatory mechanisms that may control RNA stability, localization, or translation efficiency.
In the short term, this career development award will allow me to greatly expand my research into new, previously unexplored areas during the K99 phase. The excellent training environment in the Sharp lab and at MIT will not only greatly facilitate the mentored research but also endow me with all the necessary skills to transition to an independent academic faculty position. In the long term, I am confident that these experiments will provide a foundation on which my own independent research program can grow and flourish.
In summary, by identifying the functional role of tRNA-like small RNAs as well as characterizing the mechanisms that generate and stabilize non-canonical 3' ends of long RNAs, these innovative studies will reveal key new insights into the regulation, functions, and processing of noncoding RNAs that are relevant in human cancer.
The MALAT1 locus is commonly misregulated in many human cancers. This locus generates two noncoding RNAs, although their cellular functions and the mechanisms by which they affect cancer initiation/progression are poorly understood. Therefore, characterizing the functions of these transcripts and the mechanisms by which they are regulated will both contribute to our understanding of the transcriptional output of the human genome and likely yield insights into how these transcripts affect human disease.