Using high-density tiling microarrays and/or directional RNA-seq techniques, recent studies have found that alternative and varying-level (dynamic) transcription within operons is highly prevalent, and that non-coding RNA (ncRNA) and antisense RNA (asRNA) transcription is also pervasive. Thus prokaryotic transcriptomes appear to be more complex than previously thought. However, little is known about the patterns and rules of this transcriptomic complexity, its biological implications, or its underlying molecular mechanisms. Moreover, the generality of such complexity remains unknown because inconsistent and even contradictory results have been reported, even for the same strains. Furthermore, the investigation of transcriptomic complexity in prokaryotes has been hindered by the highly biased sequence reads produced by current directional RNA-seq techniques. This bias is compounded in prokaryotes by the highly labile nature and extremely low cellular concentrations of their RNAs and by the lack of an effective method for their enrichment, leading to highly non-uniform read coverage and even numerous uncovered gaps in transcribed regions. Such non-uniform coverage and prevalent gaps make it very difficult to assemble full-length transcripts, let alone to detect dynamic transcription along operons. Consequently, little is known about the transcriptomic complexity of many medically important prokaryotes, or even of the most widely studied model bacterium, E. coli K12. This project plans to address these problems using E. coli K12 as the model system.
The specific aims are: 1) to develop an algorithm and tool for sufficiently correcting the read biases in current RNA-seq techniques; 2) to develop an accurate and efficient algorithm and tool for simultaneously assembling prokaryotic full-length transcripts and detecting possible dynamic transcription along the assembled transcripts using RNA-seq short reads; 3) to characterize the patterns and biological roles of alternative and dynamic operon utilization as well as asRNA and ncRNA transcription in E. coli K12. Accomplishing this project will not only further our understanding of the global architecture and complexity of the E. coli K12 transcriptome, but will also provide the research community with computational tools and experimental methods to address similar questions in other prokaryotes, thereby facilitating community efforts to decipher gene regulatory networks in all sequenced prokaryotic genomes. A better understanding of the gene regulatory networks of medically, agriculturally and industrially important prokaryotes will enhance our ability to prevent and cure infectious diseases, and to produce foods and other important products.

Public Health Relevance

The discoveries from this project will lead to an unprecedentedly detailed and holistic understanding of complex gene transcription in E. coli K12 and the roles it plays in bacterial responses to environmental changes. This project will also provide the research community with accurate and effective tools for studying complex gene transcription in numerous medically important bacteria. As gene transcription plays crucial roles in bacterial infections, a better understanding of complex gene transcription in bacteria will help design new strategies to prevent and cure many infectious diseases.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Brazhnik, Paul
University of North Carolina Charlotte
Biostatistics & Other Math Sci
Other Domestic Higher Education
United States
Hou, Yingnan; Gao, Bo; Li, Guojun et al. (2018) MaxMIF: A New Method for Identifying Cancer Driver Genes through Effective Data Integration. Adv Sci (Weinh) 5:1800640
Xu, Chen; Su, Zhengchang (2015) Identification of cell types from single-cell transcriptomes using a novel clustering method. Bioinformatics 31:1974-80
Li, Shan; Dong, Xia; Su, Zhengchang (2013) Directional RNA-seq reveals highly complex condition-dependent transcriptomes in E. coli K12 through accurate full-length transcripts assembling. BMC Genomics 14:520
Li, Shan; Xu, Minli; Su, Zhengchang (2010) Computational analysis of LexA regulons in Cyanobacteria. BMC Genomics 11:527