Colorado State University is awarded a grant to develop machine learning methods for detecting alternative splicing in plants and to experimentally validate selected predictions. Alternative splicing plays an important role in proteome diversity and gene regulation. Recent studies of large-scale EST/cDNA datasets have revealed that the prevalence of alternative splicing in plants is much higher than expected, reaching around 30% of genes, which is still significantly lower than in human and mouse. The gap is primarily due to the much smaller amount of cDNA/EST data available for plants, so current estimates likely fall well short of the true extent of alternative splicing in plants. In human and mouse, several projects have made non-EST-based predictions of alternative splicing; to our knowledge, none have been reported in plants. To fill this gap, the PIs will develop computational tools to predict novel alternative splicing events and the cis-elements involved in regulated alternative splicing. Alternative splicing in plants has different characteristics than in animals, and the proposed computational and experimental work will help elucidate the mechanistic basis for these differences. The initial focus will be on Arabidopsis, and the methods will be extended to rice and other plants for which genome and EST data are available. The end result of the proposed research will be a web-accessible database of predicted and validated alternative splicing events and cis-elements; the software developed during the course of this project will be made available to researchers interested in predicting alternative splicing in other plant species.

Project Report

Alternative splicing (AS) is the process whereby a given gene can produce several mRNAs (splice forms). AS exhibits different characteristics in plants vs. animals, and while it is well studied in animals, much work remains to be done in plants. Our focus during the course of this award has been the study of AS in plants using computational techniques (in Dr. Ben-Hur's lab) and experimental techniques (in Dr. Reddy's lab). The development of high-throughput sequencing as a method for studying the transcriptome, known as RNA-seq, is providing opportunities for studying alternative splicing at an unprecedented scale. And yet the computational methods for this task are still immature, and are focused on the characteristics of mammalian gene structure and the properties of RNA-seq data in those systems. To address this we have developed two computational methods: SpliceGrapher and iDiffIR. SpliceGrapher addresses the problem of inferring genome annotations from RNA-seq data. This is an extremely challenging problem, and most available methods have very low precision. With SpliceGrapher we have focused on a slightly easier problem than predicting whole transcripts, namely predicting splice graphs. At this level, which is ideal for studying AS, it achieves superior performance in comparison to the commonly used methods in this area. The second method we have developed, iDiffIR, is designed to detect intron retention events (the major form of AS in plants) that occur preferentially under one experimental condition vs. another. No other tools exist specifically for this task, and we believe it will be a valuable resource for the community. The collaboration between the two labs has allowed us to extensively test and refine our tools.
The second major thrust of the award has been the experimental study of AS in plants. This work focused on i) global analysis of AS in mutants that lack one or more splicing regulators, and ii) elucidating the mechanisms by which AS is regulated. During the course of this work we generated seven mutants (three single, three double and one triple mutant) that lack one, two or three splicing regulators. We have generated RNA-seq data from wild type and mutants using the Illumina platform. These data were used extensively to test and refine the computational methods developed in Dr. Ben-Hur's laboratory. In addition, analysis of these data has provided information on the roles of specific splicing factors in differential gene expression and splicing. We have validated many of the predicted differentially expressed and alternatively spliced genes. Results from the computational analysis of the RNA-seq data were used to predict and verify biological functions of several splicing regulators. These results have uncovered key roles of splicing regulators in abiotic stress responses in plants. Using the mutants, we have developed an in vivo assay to analyze the functions of single splicing factors, or combinations of them, in regulating alternative splicing. This work has also led to the identification of RNA sequences that bind to specific splicing factors and contribute to regulated splicing. Overall, this work has provided new insights into the roles of splicing factors in regulating AS and stress responses. This research project has allowed us to train three Ph.D. students, one M.S. student, one postdoctoral research associate and several undergraduate students.

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Application #: 0743097
Program Officer: Julie Dickerson
Project Start:
Project End:
Budget Start: 2008-03-01
Budget End: 2013-02-28
Support Year:
Fiscal Year: 2007
Total Cost: $1,086,612
Indirect Cost:

Name: Colorado State University-Fort Collins
Department:
Type:
DUNS #:
City: Fort Collins
State: CO
Country: United States
Zip Code: 80523