Genomic DNA is the information repository of the cell, encoding the myriad of proteins required to sustain life. To harness this information, cells depend on RNA polymerases, dynamic biomolecular machines that first transcribe the genetic code into RNA. Transcription is a complex and highly regulated process that governs cell growth, differentiation, development and all responses to environmental change. Importantly, the biochemical pathways that orchestrate the expression and repair of genes are intricately intertwined. As a consequence, many human diseases trace their origins to deficiencies in gene regulation or DNA repair. Understanding the molecular-level mechanisms that underlie gene expression and transcription-coupled DNA repair (TCR) is a grand challenge in biomedical science. Progress toward this goal has been hindered by the size, complexity and dynamic nature of the assemblies that accomplish transcription and TCR. In initial studies with our experimental collaborators, we combined computational modeling with cryo-electron microscopy (cryo-EM) data to determine structures of transcription preinitiation complexes (PICs) from all three classes of RNA polymerases (Pol I, Pol II and Pol III). The structures captured the PICs in multiple functional states covering the path from promoter recognition to the formation of a proficient elongation complex. These results offer an unprecedented opportunity for integrative modeling to connect the experimentally observed states, delineate DNA remodeling during the early stages of transcription and uncover the critical mechanisms of transcription regulation.
Specifically, we will leverage computational and structural systems biology approaches to 1) determine how the Pol I, II and III transcription machineries recognize and open promoter DNA; 2) examine how the transcription factor TFIID associates with promoter DNA and serves as a platform for assembling the PIC; and 3) uncover the key functions of two recognized TCR master coordinators, transcription factor IIH (TFIIH) and Cockayne syndrome B protein (CSB). Our work will benefit from synergistic collaborations with world-class experimental groups to inform, validate, and extend our models. Parallel advances in computation and cryo-EM will yield key insights into the structure, dynamics and function of gene regulatory complexes while making direct connections to genetic disease phenotypes. Success of the project will thus have major impacts, both in understanding the etiology of cancers and inherited genetic disorders and in offering a structural framework to devise effective treatments.
The project will leverage new computational and structural systems biology approaches to provide unified knowledge of the assembly, function and regulation of key transcription and transcription-coupled DNA repair complexes. Unveiling the interplay of molecular-level mechanisms and disease mutations in these complexes will yield foundational understanding of severe genetic disorders associated with cancer, aging, and developmental defects: xeroderma pigmentosum, trichothiodystrophy, Cockayne syndrome and related diseases. Success of the project will thus have major impacts on understanding human disease etiology, opening new avenues to devise effective treatments.