Intellectual Merit: This project will elucidate the biological roles and mechanisms of recursive splicing, a recently discovered process that is specific to genes that are interrupted by very large introns. Such introns are an important component of many vertebrate and invertebrate genomes, and they must be removed accurately from the pre-mRNA transcripts during gene expression to avoid introducing missense or nonsense mutations that preclude correct protein production. Extremely large introns are frequently found in genes with key roles in development and cellular regulation. Current evidence indicates that recursive splicing has a widespread role in promoting the proper expression of genes with large introns, but the underlying mechanisms remain to be elucidated. This process may facilitate efficient and accurate removal of the introns, or it may stimulate other steps in gene expression through interactions mediated by the splicing machinery. Splicing is coupled physically and functionally to transcription and other processes in mRNA biogenesis, and emerging evidence indicates that splicing itself can enhance gene expression through effects on transcript elongation, re-initiation of transcription, polyadenylation, and export of mature mRNA from the nucleus. The experimental system used in this project is the fruit fly Drosophila melanogaster. Recursive splicing has been characterized most extensively in this organism, but there is suggestive evidence that it also occurs in higher animals. In addition, Drosophila provides powerful genetic and molecular resources for analysis of recursively spliced transcription units. The specific aims are: (1) Characterize the roles of recursive splicing in gene expression. Allele substitution techniques will be used to delete non-exonic recursive splice sites in selected but diverse genes at their native chromosomal locations. The effects on developmental phenotypes and on transcription and RNA processing of the corresponding genes will be determined. (2) Characterize auxiliary elements and mechanisms for correct use of a non-exonic recursive splice site. Mutational analyses in a Drosophila cell transfection system will be used to dissect and characterize the function of sequences on the RNA that direct the use of recursive splice site RP3 in the Ultrabithorax gene. (3) Identify trans-acting factors that mediate the activity and functions of recursive splice sites. Genetic approaches will be used to identify factors that are required for correct recursive splicing at Ultrabithorax and frizzled and/or that mediate its role(s) in gene function. Biochemical and molecular approaches will be used to further characterize the mechanisms of identified factors.

Broader Impacts: This project will lead to a better understanding of pre-mRNA splicing mechanisms and strategies and their relation to other aspects of gene expression and gene structure. This is important for developing integrated models of genetic control that can have practical impact in agriculture and animal breeding, pest control, and understanding the consequences of mutation and variations in genome sequence within populations. Information generated by this project will be incorporated into an existing electronic database on recursive splicing as a publicly available resource. The project will provide research training for 2-4 graduate and 6-8 undergraduate students over a three-year period. At both levels, this will involve integrated training in experimental, computational, and comparative approaches. Three undergraduate researchers have already made important contributions to published studies leading to this project (4 undergraduate co-authorships on 2 papers during the past two years). They have gone on to top Ph.D. programs in experimental and computational biology. Additional undergraduates will be involved, including students recruited through programs to enhance diversity. The principal investigator integrates research with education and outreach by teaching undergraduate and graduate courses in related subjects and by participating as an instructor in the Pittsburgh Supercomputing Center's Minority Access to Research Careers Summer Institute In Bioinformatics.

Project Report

Complex disease traits such as schizophrenia, diabetes, heart disease and many others, which are characterized by multigene inheritance in combination with environmental effects, pose some of the greatest challenges for society because of their frequency and the difficulty of identifying the underlying genetic variations. Current methods can map the locations of candidate genes, but identifying the actual genetic variations that have a causal role (among many possible local candidates) is difficult, especially since such variants are not necessarily within the easily recognized protein-coding regions of genes. Results from this project provide examples of novel elements within introns (which account for most of a gene's length but do not contribute to its protein product) that aid their removal from the gene’s initial RNA transcript and whose mutation can impair the expression of the gene products. These complex elements are called recursive splice sites (RSSs). A specific illustration is provided by the analysis of predicted human RSSs, which led to identification of a specific example (in the dopamine reuptake transporter gene) whose function is influenced by single-nucleotide genetic variants (SNPs) that are present in human populations and that may influence risk for schizophrenia. These sequence variations are in regulatory elements relatively far from the exon splice sites, and thus would not have been immediately obvious candidates as causal SNPs. The results of this project indicate that RSSs contribute to proper expression of genes in the model organism Drosophila by aiding the accuracy of splicing out large introns and by helping to promote transcription of the RNA.
The results also elucidate how RSSs, which consist of overlapping splice acceptor and donor motifs that would be expected to interfere with each other, can actually function effectively and in a coordinated manner as a consequence of distinctive nucleotide sequence features and the action of regulatory ribonucleoprotein complexes. Additional results provide novel examples of how mobile genetic elements (here, non-LTR retroelements) can modify gene structure and expression during evolution, in this case through the introduction of RSSs that can eventually evolve into protein-coding exons. The finding of RSSs specifically in those non-LTR retroelements that form the specialized ends of chromosomes in some insects suggests that recursive splicing may play a role in control of chromatin structure and/or expression of genes in particular chromatin contexts. This project has provided research training in experimental and computational biology to three graduate students and five undergraduates, all of whom have moved on to careers in biomedical research and/or health professions. In addition, the PI has participated in synergistic education outreach activities designed to enhance the participation of underrepresented groups in computational and biological research. Another synergistic activity has been the design and teaching of a two-semester genomics course for freshmen based on a real hands-on research project. This provides students with an integrated conceptual and technical view of molecular genetics and the research process at the very start of their undergraduate program. The project ownership and excitement developed by these students provide enormous motivation that carries through their entire undergraduate careers.

Agency: National Science Foundation (NSF)
Institute: Division of Molecular and Cellular Biosciences (MCB)
Application #: 0821202
Program Officer: Karen C. Cone
Project Start:
Project End:
Budget Start: 2008-09-01
Budget End: 2012-08-31
Support Year:
Fiscal Year: 2008
Total Cost: $480,000
Indirect Cost:
Name: Carnegie-Mellon University
Department:
Type:
DUNS #:
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213