While the average human gene is on the order of 10-20 kb in length, the human genome also contains a significant number of genes that are much longer. Some genes exceed one million base pairs in length, and many of these long genes contain introns hundreds of kilobases long. These features pose an extreme challenge to the processes of transcription and RNA splicing. For RNA splicing, very large introns present the problem of identifying the correct splice sites and exons against a background of similar "decoy" sequences within the introns. Current models of splicing signals cannot properly predict the splicing pattern of large genes. To begin to address some of these problems, we have recently developed methods to measure the rates of RNA polymerase II (RNAPII) elongation and RNA splicing in large human genes in their in situ chromosomal locations and normal chromatin environments. We have shown that transcription proceeds rapidly in large genes and that splicing occurs co-transcriptionally within minutes of synthesis, regardless of the length of the intron. We now have evidence that large introns are spliced in a single event, implying that mechanisms must exist to suppress the use of decoy splice sites. We propose to use a variation of our previous experiment to investigate the temporal order of splicing factor binding to long introns in order to test current theories of exon definition. In addition to the splicing signals contained in the pre-mRNA, splicing information could also be encoded in the structure of chromatin along genes. In support of this idea, we have shown that exons are enriched in nucleosomes relative to adjacent intron sequences and that these exonic nucleosomes are also enriched in specific histone methyl marks. We propose to determine the function of these methyl marks by selectively removing or enhancing them through knockdown or over-expression of the specific methyltransferases.
We will confirm these alterations by ChIP analysis and then measure the rate and fidelity of RNA splicing. Changes in alternative splicing will be detected by transcriptome analysis. Many accessory factors for RNAPII transcription elongation have been identified in in vitro and in vivo studies. However, few, if any, of these have been shown to be required for RNAPII elongation in vivo in mammalian cells. Large genes in particular should depend on optimal RNAPII elongation, making them potentially useful for the analysis of these factors. We propose to examine elongation and splicing rates following modification of the gene expression machinery. First, we will use RNAi knockdowns of elongation factors to determine their in vivo roles in the transcription of long genes. We will also use mutant versions of RNAPII containing truncations and modifications of the important C-terminal domain to address the roles of this domain in transcription and splicing in large genes. These studies will advance our understanding of human gene expression, which is of major relevance to both normal and pathological cell growth and development.

Public Health Relevance

The regulated expression of genes is central to human growth, development, and normal and pathological functioning, as well as to the body's response to changes in its internal and external environment. This proposal is designed to understand how genes in their natural chromosomal environment are correctly expressed in human cells. In particular, we propose experiments that probe expression mechanisms in genes that are substantially larger than average. Such large genes include several tumor suppressor genes that are inactivated in many human cancers. We hope to learn the rules and identify the factors that play roles in the expression of large genes in order to understand, predict, and perhaps prevent the aberrant expression of genes in pathological conditions.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Molecular Genetics B Study Section (MGB)
Program Officer: Bender, Michael T
Cleveland Clinic Lerner
Other Basic Sciences
Schools of Medicine
United States