The overall goal of this project is to further our understanding of alternative polyadenylation (APA) in eukaryotes, including the extent to which it occurs and the mechanisms by which it is regulated. Maturation of eukaryotic mRNA requires several post-transcriptional processes, one of which is 3'-end formation. This process consists of pre-mRNA cleavage and polyadenylation, in which a poly(A) tail is added to the newly cleaved 3'-end of the pre-mRNA. It is guided by poly(A) signal motifs on the pre-mRNA. Polyadenylation has been shown to influence mRNA stability, translatability, and transport from the nucleus to the cytoplasm. To complicate matters, many eukaryotic genes have more than one poly(A) site. Alternative polyadenylation (APA) is a phenomenon in which these different poly(A) sites are used to generate mRNA transcripts of the same gene with different 3'-ends. Increasing evidence suggests that APA is a key contributor to the regulation of gene expression, affecting mRNA levels and/or the functions of the encoded proteins. It also affects the nature and length of the 3'-UTR, which harbors potential cis-regulatory motifs that are important for mRNA stability and translational suppression. APA appears to be regulated by both developmental and environmental cues, and it often occurs in a tissue- and/or disease-specific manner. Mutations in poly(A) signals or polyadenylation protein factors can cause severe diseases, including cancers. However, many important questions about the underlying mechanisms of APA remain unanswered. For example, it is unclear how developmental and environmental cues are transduced to the APA process, or what special poly(A) signals (i.e., special RNA structures) are needed to guide APA. Next-generation sequencing (NGS) has revolutionized transcriptome sequencing, allowing us to achieve considerably higher sequencing depth and coverage than could be achieved through Sanger sequencing.
The resulting high-volume NGS datasets make it possible to apply innovative methodologies in our bioinformatics analyses of polyadenylation, which will advance our understanding of APA and its regulatory mechanisms. Specifically, poly(A) sites will be more accurately annotated and cataloged based on their locations (e.g., within introns vs. exons), types (e.g., antisense vs. sense), and biological consequences (e.g., nonsense-mediated decay, protein product truncation, and modification of microRNA binding sites). We will study how APA patterns and relevant cis-regulatory motifs differ across species, genotypes, tissues, developmental stages, and environmental stresses. We will validate our predictions through wet lab protocols. Finally, we will release our data, protocols, and software to the research community through a richly analyzed and visualized online database and web service. Completing the aims of this project for our model organisms will pave the way to a more complete understanding of these complex regulatory mechanisms in humans and other eukaryotes.
When eukaryotic genes are expressed, their mRNAs have to be processed during maturation. One step of this processing is polyadenylation: the attachment of a poly(A) tail to mark and protect the end of the mRNA. However, alternative polyadenylation (APA) at a different poly(A) site can result in loss of information from the mRNA and has been linked to cancers and other diseases. Using next-generation sequencing data, we will employ innovative methodologies to accurately annotate poly(A) sites, detect poly(A) signals, examine APA and its regulatory mechanisms, and provide the biomedical and biological research communities with an information-rich database that will advance our understanding of the role of APA in human diseases.