Most vertebrate genes contain multiple introns, which must be precisely removed from the primary transcript before its export from the nucleus to create the proper mRNA to direct translation. RNA splicing, the process responsible for removing introns and ligating exons, is therefore an essential step in the expression of most genes. However, the basis for the specificity of this process is not well understood. The goal of this proposal is to understand the rules used by the vertebrate RNA splicing machinery to identify exons, introns and splice sites in primary transcripts, and to encode these rules in computer programs that predict the splicing pattern of an arbitrary input primary transcript sequence. This will be accomplished by in-depth computational and statistical analysis of available primary transcript and mRNA sequences of vertebrate genes, taking advantage of recent progress in large-scale genome sequencing and cDNA sequencing efforts. The approach will involve: 1) analysis of the detailed compositional properties of the 5' and 3' splice signals and branch signals of vertebrate introns; 2) identification of exonic and intronic splicing enhancers and repressors; and 3) integrated computer models of splicing specificity. A variation of the Gibbs sampling algorithm will be used to characterize the branch signal and other signals that occur at a characteristic but variable distance from splice junctions. Clustering algorithms will be used to identify natural subgroups of 5' and 3' splice signals by composition and to assign scores to potential splice signals. A statistical approach will be applied to identify short sequence motifs that are likely to function as exonic or intronic splicing enhancers or repressors, based on differences in oligonucleotide composition between exons and introns with weak versus strong splice signals.
Conservation of putative splicing enhancers and repressors between homologous exons and introns from different vertebrates will be explored. As knowledge about splicing specificity accumulates, it will be integrated into computer models that predict the splicing patterns of primary transcripts. These models will be adapted to the problems of gene identification in genomic sequences and prediction of the splicing phenotypes of human mutations and polymorphisms. Deciphering the 'splicing code' will be essential to understanding the basis of alternative splicing, an important regulatory mechanism involved in development, differentiation and apoptosis. Computational methods for predicting splicing patterns will also aid in identifying genes, including human disease genes, and in understanding the effects of disease gene mutations, approximately 15% of which affect splicing.
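
The oligonucleotide-composition comparison described in the abstract can be illustrated with a minimal sketch. It is not the project's actual method: the function names (`kmer_counts`, `kmer_log_odds`), the choice of a pseudocounted log-odds ratio, and the default `k=6` are all assumptions made for illustration. The idea is only that a k-mer enriched in exons flanked by weak splice signals, relative to exons flanked by strong ones, becomes a candidate enhancer motif.

```python
from collections import Counter
from math import log2


def kmer_counts(seqs, k):
    """Count overlapping k-mers (A/C/G/T only) across a list of sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            kmer = s[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def kmer_log_odds(weak_seqs, strong_seqs, k=6, pseudocount=1.0):
    """Log2 ratio of k-mer frequency in 'weak' vs 'strong' sequence sets.

    Hypothetical scoring scheme: positive scores mark k-mers that are
    relatively more frequent in sequences with weak splice signals.
    A pseudocount is added to every possible k-mer to avoid division by zero.
    """
    weak = kmer_counts(weak_seqs, k)
    strong = kmer_counts(strong_seqs, k)
    n_types = 4 ** k
    weak_total = sum(weak.values()) + pseudocount * n_types
    strong_total = sum(strong.values()) + pseudocount * n_types
    scores = {}
    for kmer in set(weak) | set(strong):
        p_weak = (weak[kmer] + pseudocount) / weak_total
        p_strong = (strong[kmer] + pseudocount) / strong_total
        scores[kmer] = log2(p_weak / p_strong)
    return scores


if __name__ == "__main__":
    # Toy input, invented for illustration only.
    scores = kmer_log_odds(["GAAGAAGAAGAA"], ["CCCCCCCCCCCC"], k=3)
    for kmer, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(kmer, round(score, 3))
```

A real analysis would also need a definition of "weak" and "strong" splice signals (for example, a score threshold) and a significance test on the enrichment; both are left out here.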

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 1R01HG002439-01
Application #: 6422364
Study Section: Genome Study Section (GNM)
Program Officer: Good, Peter J
Project Start: 2002-02-06
Project End: 2007-01-31
Budget Start: 2002-02-06
Budget End: 2003-01-31
Support Year: 1
Fiscal Year: 2002
Total Cost: $381,000
Indirect Cost:

Name: Massachusetts Institute of Technology
Department: Biology
Type: Schools of Arts and Sciences
DUNS #:
City: Cambridge
State: MA
Country: United States
Zip Code: 02139

Pai, Athma A; Henriques, Telmo; McCue, Kayla et al. (2017) The kinetics of pre-mRNA splicing in the Drosophila genome and the influence of gene architecture. Elife 6:
Taliaferro, J Matthew; Lambert, Nicole J; Sudmant, Peter H et al. (2016) RNA Sequence Context Effects Measured In Vitro Predict In Vivo Protein Binding and Regulation. Mol Cell 64:294-306
Taliaferro, J Matthew; Vidaki, Marina; Oliveira, Ruan et al. (2016) Distal Alternative Last Exons Localize mRNAs to Neural Projections. Mol Cell 61:821-33
Katz, Yarden; Wang, Eric T; Silterra, Jacob et al. (2015) Quantitative visualization of alternative exon expression from RNA-seq data. Bioinformatics 31:2400-2
Merkin, Jason J; Chen, Ping; Alexis, Maria S et al. (2015) Origins and impacts of new mammalian exons. Cell Rep 10:1992-2005
Lambert, Nicole; Robertson, Alex; Jangi, Mohini et al. (2014) RNA Bind-n-Seq: quantitative assessment of the sequence and structural binding specificity of RNA binding proteins. Mol Cell 54:887-900
Shalgi, Reut; Hurt, Jessica A; Krykbaeva, Irina et al. (2013) Widespread regulation of translation by elongation pausing in heat shock. Mol Cell 49:439-52
Spies, Noah; Burge, Christopher B; Bartel, David P (2013) 3' UTR-isoform choice has limited influence on the stability and translational efficiency of most mRNAs in mouse fibroblasts. Genome Res 23:2078-90
Han, Hong; Irimia, Manuel; Ross, P Joel et al. (2013) MBNL proteins repress ES-cell-specific alternative splicing and reprogramming. Nature 498:241-5
Hurt, Jessica A; Robertson, Alex D; Burge, Christopher B (2013) Global analyses of UPF1 binding and function reveal expanded scope of nonsense-mediated mRNA decay. Genome Res 23:1636-50

Showing the most recent 10 out of 30 publications