Originally thought to be a relatively uncommon phenomenon, alternative splicing is now appreciated to be a widespread and primary mechanism by which eukaryotes have expanded the structural and functional diversity of their encoded proteome. The new generation of ultra-high throughput sequencers has opened up new ways to study the cell's alternative splicing and its variation in response to environmental conditions. Accurate characterization of the transcriptome from the hundreds of millions of random short sequences sampled from messenger RNA samples, however, is still an unsolved problem. The PI proposes a research plan involving a set of novel computational approaches to address the issue. It includes: (1) a maximum likelihood approach to achieve highly sensitive and accurate identification of both novel and known splicing and fusion events; (2) a genome-wide transcriptome comparison method to detect statistically significant differential alternative splicing patterns across biological samples; and (3) data mining algorithms to reconstruct co-regulated splicing networks carrying out specific biological functions.
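To make the idea of differential alternative splicing in item (2) concrete, the following is a minimal sketch and not the PI's proposed method. It assumes each splicing event in each sample has been reduced to two read counts (reads supporting exon inclusion and reads supporting exclusion) and substitutes a plain two-sided Fisher's exact test for the genome-wide comparison method the plan describes. The function names `psi` and `fisher_exact_two_sided` and all counts are hypothetical.

```python
from math import comb


def psi(inclusion_reads, exclusion_reads):
    """Percent spliced in: fraction of reads that support exon inclusion."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads cover this splicing event")
    return inclusion_reads / total


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical): row 1 is sample A, row 2 is sample B;
    column 1 is inclusion reads, column 2 is exclusion reads.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x inclusion reads in sample A
        # when the row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the
    # observed one; the small tolerance absorbs floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical read counts for one exon-skipping event in two samples.
    sample_a = (120, 30)  # (inclusion, exclusion)
    sample_b = (60, 90)
    print("PSI A:", psi(*sample_a))
    print("PSI B:", psi(*sample_b))
    print("p-value:", fisher_exact_two_sided(*sample_a, *sample_b))
```

A genome-wide analysis would run such a test on every event and would therefore also need a multiple-testing correction and some treatment of replicate variability, both of which this sketch omits.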
The successful implementation of this research plan will produce a suite of computational and statistical methods, implemented as open-source software, to meet the immediate demand from the biology community for the analysis of high-throughput RNA-seq datasets. These tools will enable individual scientists to assess the mRNA transcriptome in a matter of days using samples from any organism with a reference genome.
The integrated educational program aims to increase awareness of bioinformatics as a critical interdisciplinary research area among both undergraduate and graduate students from biology, computer science, and engineering at the University of Kentucky (UKy). The PI is committed to enriching the undergraduate curriculum with a new introductory bioinformatics course and to improving cross-disciplinary research training opportunities for graduate and undergraduate students through the Bioinformatics Certificate Program and a newly established Biomedical Informatics Department at UKy. In addition, the PI will emphasize recruitment and retention of under-represented groups, including female students and students from the Appalachian region, through the NSF-funded AMSTEMM program at UKy.