How cell-type specific gene expression programs are established and maintained is a fundamental question in molecular biology. In mammalian cells, hundreds of sequence-specific transcription factors have been catalogued, and they bind the regulatory regions of their target genes in cell-type specific and combinatorial occupancy patterns. Moreover, the developmental programs that generate different cell lineages are accompanied by complex chromatin remodeling. Increasing evidence suggests that the regulatory regions of cell-type specific genes may often be established and sometimes "poised" by chromatin marks at earlier stages in development. However, the detailed characterization of gene regulatory regions (including their initial establishment in earlier progenitor cells, the dynamics of their chromatin state, and the combinatorial control of gene transcriptional output by multiple transcription factors) has been carried out for only a handful of developmentally important genes. The goal of this project is to develop new integrative computational methods that exploit massive next-generation sequencing data sets to fundamentally advance our understanding of cell-type specific transcriptional programs. We will develop integrative computational analysis methods for (1) learning the sequence and chromatin determinants of transcription factor binding from ChIP-seq and DNase-seq; (2) mapping the landscape of chromatin accessibility of all regulatory regions in the human and mouse genomes using DNase-seq across all available cell types, dissecting the poising of their chromatin state in earlier progenitor cells, and extracting the sequence code governing their gain and loss in differentiation; and (3) modeling cell-type specific gene expression programs as a function of chromatin state, transcription factor binding, and regulatory sequence analysis.
We will couple our computational methods development with targeted experimental validation, including both locus-specific and genome-wide assays.
Understanding gene regulation is fundamental to the study of normal cellular processes as well as disease. In this project, we develop computational methods to exploit multiple sources of large-scale genomics data enabled by next-generation sequencing technology in order to provide new tools for studying gene regulation in mammalian cells.
|Garrett-Bakelman, Francine E; Sheridan, Caroline K; Kacmarczyk, Thadeous J et al. (2015) Enhanced reduced representation bisulfite sequencing for assessment of DNA methylation at base pair resolution. J Vis Exp :e52246|
|González, Alvaro J; Setty, Manu; Leslie, Christina S (2015) Early enhancer establishment and regulatory locus complexity shape transcriptional programs in hematopoietic differentiation. Nat Genet 47:1249-59|
|Setty, Manu; Leslie, Christina S (2015) SeqGL Identifies Context-Dependent Binding Signals in Genome-Wide Regulatory Element Maps. PLoS Comput Biol 11:e1004271|
|Pelossof, Raphael; Singh, Irtisha; Yang, Julie L et al. (2015) Affinity regression predicts the recognition code of nucleic acid-binding proteins. Nat Biotechnol 33:1242-1249|
|Shih, Alan H; Jiang, Yanwen; Meydan, Cem et al. (2015) Mutational cooperativity linked to combinatorial epigenetic gain of function in acute myeloid leukemia. Cancer Cell 27:502-15|
|Li, Sheng; Mason, Christopher E (2014) The pivotal regulatory landscape of RNA modifications. Annu Rev Genomics Hum Genet 15:127-50|
|Li, Sheng; Łabaj, Paweł P; Zumbo, Paul et al. (2014) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nat Biotechnol 32:888-95|
|Li, Sheng; Garrett-Bakelman, Francine; Perl, Alexander E et al. (2014) Dynamic evolution of clonal epialleles revealed by methclone. Genome Biol 15:472|
|SEQC/MAQC-III Consortium (2014) A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. Nat Biotechnol 32:903-14|
|Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng et al. (2014) An integrative computational approach for prioritization of genomic variants. PLoS One 9:e114903|
Showing the most recent 10 out of 18 publications