The completion of the Human Genome Project left an enduring mystery: Why do protein-coding genes account for only a tiny fraction of the genome sequence? What functional elements reside in the rest of the genome? Nearly a decade later, in 2012, the ENCODE (Encyclopedia of DNA Elements) Consortium, in which I participate, revealed the extraordinary abundance of long non-coding RNA (lncRNA) genes in the human genome (Derrien et al 2012). This achievement builds, in part, on my prior work (Jia et al 2010). In contrast to microRNAs, lncRNAs act through diverse and heterogeneous mechanisms, as both positive and negative regulators of gene expression. Among the properties of lncRNAs, their low interspecies conservation is particularly intriguing: nearly 5,000 human lncRNAs are not conserved beyond primates. This is interesting and important because, traditionally, protein-coding genes conserved in evolution were thought to be responsible for most functional outcomes in normal cellular processes and in disease. However, evolutionary lineage-specific biological and pathological responses and mechanisms are increasingly clear. Do non-conserved lncRNAs have lineage- and species-specific functions in human disease? To investigate this, I will examine the functions of evolutionarily non-conserved human lncRNAs in a biologically and clinically relevant system: human MCF7 cells, an established in vitro model of estrogen receptor alpha-positive breast cancer. To generate preliminary data, I utilized my custom human lncRNA microarray (Lipovich et al 2012) to interrogate lncRNAs for estrogen responsiveness in MCF7 cells. I identified 127 estrogen-responsive lncRNAs, analyzing 18 by RNAi and overexpression, followed by a panel of six phenotypic assays. Knockdown of estrogen-induced, and overexpression of estrogen-repressed, primate-specific lncRNAs reduced cell viability and proliferation, and in several cases caused cell death.
This finding prompted my central hypothesis: certain primate-specific lncRNAs shift human cells along the apoptosis-proliferation axis. In this project, I will pursue this hypothesis and expand upon it. I will extend the six functional assays to all 127 leads. I will employ second- and third-generation RNAseq, instead of microarrays, to impute the complete MCF7 estrogen-responsive lncRNAome. I will identify novel primate-specific lncRNAs from RNAseq data and subject them to my optimized workflow of system perturbations and phenotypic assays. Having shown ectopic translation of some lncRNAs within the framework of the ENCODE Consortium (Bánfai et al 2012), I will also test whether primate-specific functional lncRNAs act directly as RNAs, not via translated peptides. The results will approach a fascinating new question with the potential to initiate a paradigm shift in cancer biology: Is human cancer, to an extent, a primate-specific disease caused by non-conserved lncRNAs? Implementation of this proposal will result in the first-ever conservation-unbiased, high-throughput assignment of cellular functions to primate-specific lncRNAs in a major nuclear hormone receptor pathway that is highly relevant to cancer therapeutics.
The discoveries of the post-genomic era have refined our understanding of the role of RNA in the biology of cells: RNA, rather than being almost exclusively a messenger in the flow of information from DNA to protein, is responsible for many essential functions in cells and organisms, directly and independently of protein-coding potential. Long non-protein-coding RNA (lncRNA), recently shown by my lab and others to be surprisingly abundant and multifunctional in human cells, is different from mRNA in a fundamental way: lncRNA genes often lack sequence conservation even between closely related species. Focusing on the novel idea that non-conserved lncRNAs can have essential cellular and disease functions, this New Innovator project will identify primate-specific lncRNAs that are functional in cell growth and cell death, within the framework of a major nuclear hormone receptor pathway in human breast cancer, and will catalog non-conserved RNA genes that contribute to breast cancer pathogenesis in humans, in order to identify potential therapeutic targets that, unlike driver mutations in deeply conserved protein-coding genes, are unlikely to be critical to the health of normal cells.