Next-generation sequencing and chromatin immunoprecipitation (ChIP) experiments are generating genome-wide datasets of epigenetic modifications that describe cellular states. The recent ENCODE project has generated hundreds of datasets using genome-wide approaches to map protein-DNA interactions. These dynamic chromatin state maps reveal many thousands of putative cis-regulatory modules (CRMs) in the genome, far outnumbering genes. These CRMs are thought to modulate gene expression by recruiting specific combinations of trans-acting factors, such as transcription factors (TFs) and non-coding RNAs. In parallel, genome-wide association studies (GWAS) have mapped thousands of single nucleotide polymorphisms (SNPs) to non-coding regions, suggesting that polymorphisms may alter gene expression by affecting the binding of regulatory trans factors to CRMs. Despite these advanced techniques for localizing CRMs in the genome, we currently lack robust high-throughput approaches to discover the proteins that interact with these CRMs and to characterize their functional roles. To address this gap, we propose to: (1) experimentally validate the dynamic recruitment of muscle TFs to novel CRMs and study protein-CRM interactions in regulated genes with proteomics analyses; (3) develop in vivo bait approaches to observe protein-DNA interactions in live cells. By focusing on genes significantly up-regulated at the transcript and protein level during muscle differentiation, we will compare proteins bound at novel regulatory loci with neighboring control sequences to identify novel TFs bound at candidate CRMs. In addition to building a powerful toolbox for unbiased proteomic characterization of proteins interacting with specific genomic loci, we will apply our technologies to study candidate CRMs and known muscle regulatory loci surrounding highly regulated genes in the well-characterized C2C12 muscle differentiation model.
This work will provide genome biologists with new approaches to identify novel transcription factors and a clearer understanding of the functional significance of CRMs in gene regulation.
Even subtle changes in the interactions of proteins with specific DNA sequences can cause multifactorial diseases such as cancer, type 2 diabetes, or schizophrenia. New genome-wide approaches are identifying new regulatory sites in the genome, but we currently lack effective tools to discover the cohort of proteins that interact at these sites. We propose to develop robust, sensitive, and specific approaches to identify proteins at genomic regulatory elements in order to study their effect on gene expression.
Lau, Ho-Tak; Suh, Hyong Won; Golkowski, Martin et al. (2014) Comparing SILAC- and stable isotope dimethyl-labeling approaches for quantitative proteomics. J Proteome Res 13:4164-74