Large consortium efforts have collected hundreds of genome-wide datasets that have delineated myriad regulatory regions, transcription factor binding sites and large numbers of coding and non-coding transcripts. Even with this massive amount of data, it remains a significant challenge to determine how the mapped elements function together in regulatory networks. This is due in large part to our inability to accurately and quantitatively detect all forms of nascent transcription, the instantaneous output of transcriptional regulation. Moreover, our understanding of global gene regulation is restricted by a lack of computational tools that seamlessly integrate genome-wide datasets. The overall goal of this proposal is to maximize the impact of nascent transcriptome studies and enable facile integration with other functional genomic data. My group developed native elongating transcript sequencing (NET-seq), which enables strand-specific, nucleotide-resolution mapping of RNA polymerase density, capturing all transcriptional activity regardless of transcript half-lives and revealing the precise positions of Pol II pausing where regulatory control is applied. Here, we will develop a new version of NET-seq, NET-seq 2.0, that can be applied routinely, scalably and flexibly to diverse human cell types (or any eukaryotic system). Moreover, we will increase the potential of NET-seq analysis by developing two innovative bioinformatics strategies that seamlessly integrate NET-seq data with other genome-wide datasets; these strategies will have applications beyond NET-seq studies. To demonstrate the broad utility of our integrated approach, we will study regulatory networks and cell differentiation, for which instantaneous nascent transcriptional analysis will be highly impactful.
In Aim 1, our goal is to make NET-seq easier, cheaper, and more flexible. Our improvements will reduce background and increase usable reads, dramatically reduce cell input requirements (100- to 1,000-fold), enable dense, region-specific analyses of transcription, and enable quantitative comparisons between samples and conditions.
In Aim 2, we will determine transcription kinetics by integrating NET-seq data with metabolic RNA labeling (TT-seq) data, which report local synthesis rates. This integrative approach yields a rich transcriptional phenotype that we will use to develop gene regulatory network models.
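As a rough illustration of how the two data types could be combined, consider a steady-state flux relation: if transcripts are synthesized through a region at rate s and polymerases occupy that region at density d, the local elongation velocity scales as s / d. The sketch below is not taken from the proposal. It assumes both tracks are already binned and normalized to comparable units, and the function name and example values are hypothetical.

```python
import math


def relative_velocity(synthesis_rate, pol2_density):
    """Estimate relative elongation velocity per bin as synthesis / density.

    Assumptions (illustrative only, not the proposal's method):
    - synthesis_rate: per-bin local synthesis rates (TT-seq-like signal)
    - pol2_density: per-bin polymerase occupancy (NET-seq-like signal)
    - both lists cover the same bins and are already normalized
    Bins with no polymerase signal are returned as NaN, because the
    ratio is undefined there.
    """
    if len(synthesis_rate) != len(pol2_density):
        raise ValueError("inputs must cover the same bins")
    return [
        s / d if d > 0 else math.nan
        for s, d in zip(synthesis_rate, pol2_density)
    ]


# Hypothetical example: constant synthesis across four bins, varying density.
synthesis = [2.0, 2.0, 2.0, 2.0]
density = [4.0, 8.0, 0.0, 1.0]
velocity = relative_velocity(synthesis, density)
print(velocity)  # [0.5, 0.25, nan, 2.0]
```

In this toy example the second bin has the highest density for the same synthesis rate, so it has the lowest relative velocity, which is the signature one would expect at a pause site.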
In Aim 3, we will create new computational algorithms that circumvent the need to determine each molecular event separately, and instead infer the status of unmapped events using information-rich datasets, such as NET-seq. We will use integrative deep neural networks (deep learning) to predict unavailable genome-wide datasets from those already on hand. We will apply this approach to study erythropoiesis in a well-defined primary human hematopoietic differentiation system, using time-series NET-seq and DNase-seq analysis. These data will inform deep neural network models that predict ChIP-seq data for myriad transcription factors and chromatin marks, allowing us to investigate key regulatory events without additional expense.
The proposed research is relevant to public health because the discovery of regulatory mechanisms in transcription at high resolution is ultimately expected to significantly advance our understanding of most human diseases. As such, the proposed research is relevant to the part of the NIH's mission that seeks to develop fundamental knowledge to inform the diagnosis and treatment of human disease.