The expression of the genetic information inherent in our DNA involves four basic processes: 1) transcription into RNA; 2) splicing together the fragments of information in this RNA into messenger RNA, or mRNA; 3) translation of the mRNA into proteins; and 4) modification of the proteins to make them effective. This proposal focuses on the second process, pre-mRNA splicing. Although the chemistry of the splicing reaction is fairly well understood, it is not yet clear how the cell recognizes the boundaries of the few relatively short regions of the pre-mRNA that code for protein (the exons, ~100 nucleotides (nt) long, ~10 per transcript) within a long (~20,000 nt) pre-mRNA molecule. The splice sites themselves consist of sequences with specific features. For example, each spliced-out region (the intron) almost always starts with a GT and ends with an AG sequence. However, the splice site sequences are not distinctive enough to provide an unambiguous mark. We will pursue four approaches with the aim of deciphering the "splicing code," i.e., the sequence elements and rules that allow recognition of splice sites lying within the sequence of the pre-mRNA or DNA: 1) We will insert all possible sequences of 6 nt (4,096) into a weakened exon to define the complete list of those that can enhance splicing. By repeating this experiment and comparing the sequences found after altering the exon in various ways, we will learn how different parts of the overall sequence interact to create a signal. These experiments exploit recently developed methods for massive sequencing of short regions of DNA. 2) We have found that limited intronic regions just outside the exon can play powerful roles in splice site recognition, but little is known about the general nature or action of these sequences. We will investigate the effect of the position and protein-binding properties of these intronic enhancers on splicing and on chromatin structure. Our use of a cellular gene for this purpose is an improvement over the less natural test systems currently in use. 3) It now appears that the density of signals influencing splicing is very high, so that any manipulation of a natural sequence is likely to change more than one signal at once. To minimize this effect, we will build synthetic exons designed using insulated modules of known effect (enhancers and silencers of splicing). By placing these modules in various permutations, we will learn the rules governing their interactions. 4) Statistical analysis of the human genome sequence has allowed the successful prediction of exonic enhancers and silencers. We will extend such computational approaches to search for intronic and exonic signals that cooperate to enhance splicing and that may act to silence false splice sites. Many human genetic diseases are caused by splicing deficiencies, and cancer cells often exhibit abnormal splicing patterns. Knowledge of the splicing code will enable this process to be targeted for therapeutic use, such as correcting a deficiency in a genetic disease or disrupting a harmful splicing event in a tumor.
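
As a rough illustration of two points in the abstract, the following Python sketch enumerates every 6-nt sequence (4^6 = 4,096) and shows why the GT...AG rule alone leaves splice sites ambiguous. It is not the project's actual pipeline: the function names, the insertion position, and the toy sequences are hypothetical and chosen only for illustration.

```python
from itertools import product

BASES = "ACGT"


def all_kmers(k=6):
    """Return every DNA sequence of length k, in lexicographic order."""
    return ["".join(p) for p in product(BASES, repeat=k)]


def insert_kmer(exon, kmer, position):
    """Return a copy of `exon` with `kmer` inserted at index `position`."""
    return exon[:position] + kmer + exon[position:]


def count_candidate_introns(seq, min_len=4):
    """Count spans that start with GT and end with AG.

    Any GT followed downstream by an AG is counted, provided the span
    (GT through AG inclusive) is at least `min_len` nt long. The count
    grows quickly with sequence length, which is why GT/AG by itself
    cannot mark real introns unambiguously.
    """
    gt_starts = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    ag_starts = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return sum(
        1
        for s in gt_starts
        for e in ag_starts
        if e + 2 - s >= min_len
    )


if __name__ == "__main__":
    hexamers = all_kmers(6)
    print(len(hexamers))  # 4096

    # Hypothetical placeholder exon, not a real gene sequence.
    toy_exon = "ATGGCCTTCAAGGACCTG"
    library = [insert_kmer(toy_exon, h, 9) for h in hexamers]
    print(len(set(library)))

    print(count_candidate_introns("GTAAGTCCAGTTTAG"))
```

In an experiment of the kind described, each member of such a library would be scored by sequencing the spliced products; the sketch covers only the enumeration step.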

Public Health Relevance

Human genes control our lives by having their information translated into the proteins that operate our cells. That genetic information is present as fragments that must be spliced together to make any sense, and disruption of the splicing process causes many genetic diseases and can contribute to cancer. Our proposal is aimed at understanding how this splicing takes place.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM072740-06
Application #
8145635
Study Section
Molecular Genetics C Study Section (MGC)
Program Officer
Bender, Michael T
Project Start
2005-08-01
Project End
2014-08-31
Budget Start
2011-09-01
Budget End
2012-08-31
Support Year
6
Fiscal Year
2011
Total Cost
$453,721
Indirect Cost
Name
Columbia University (N.Y.)
Department
Biology
Type
Other Domestic Higher Education
DUNS #
049179401
City
New York
State
NY
Country
United States
Zip Code
10027
Ke, Shengdong; Anquetil, Vincent; Zamalloa, Jorge Rojas et al. (2018) Saturation mutagenesis reveals manifold determinants of exon definition. Genome Res 28:11-24
Arias, Mauricio A; Lubkin, Ashira; Chasin, Lawrence A (2015) Splicing of designer exons informs a biophysical model for exon definition. RNA 21:213-29
Ke, Shengdong; Shang, Shulian; Kalachikov, Sergey M et al. (2011) Quantitative evaluation of all hexamers as exonic splicing elements. Genome Res 21:1360-74
Ke, Shengdong; Chasin, Lawrence A (2011) Context-dependent splicing regulation: exon definition, co-occurring motif pairs and tissue specificity. RNA Biol 8:384-8
Ke, Shengdong; Chasin, Lawrence A (2010) Intronic motif pairs cooperate across exons to promote pre-mRNA splicing. Genome Biol 11:R84
Arias, Mauricio A; Ke, Shengdong; Chasin, Lawrence A (2010) Splicing by cell type. Nat Biotechnol 28:686-7
Zhang, Xiang H-F; Arias, Mauricio A; Ke, Shengdong et al. (2009) Splicing of designer exons reveals unexpected complexity in pre-mRNA splicing. RNA 15:367-76