We are analyzing the mouse genome using new analytical tools and statistics to compare the results of several next-generation sequencing (NGS) experiments. ChIP-seq, microarray, and RNA-seq data were analyzed to further assess the role of the HMGN1 and HMGN2 proteins in chromatin organization and gene expression. We developed analysis pipelines for ChIP-seq experiments on DNA sequences bound to HMGN1 and HMGN2 in wild-type and double-knockout mice. As a result of these collaborations, we now have an efficient, adjustable pipeline that analyzes many NGS datasets in a reasonable time and lets us readily interrogate the data to develop biological interpretations and devise new questions. We have analyzed epigenetic marks across the mouse genome in a variety of cell types to assess the changes caused by HMGN protein deficiency in the double knockouts, particularly in enhancer and super-enhancer regions in mouse embryonic fibroblasts, mouse embryonic stem cells, and mouse induced pluripotent stem cells. We have also shown that the dynamic nature of the chromatin epigenetic landscape plays a key role in the establishment and maintenance of cell identity, yet the factors that affect the dynamics of the epigenome are not fully known. We find that the ubiquitous nucleosome-binding proteins HMGN1 and HMGN2 preferentially colocalize with epigenetic marks of active chromatin and with cell-type-specific enhancers.

In a collaboration, we performed ChIP followed by massively parallel sequencing (ChIP-seq) against Mediator subunits from the head (Med17), middle/tail (Med14), tail (Med15 and Med2), and CDK (Cdk8) modules in budding yeast. To better distinguish low levels of association from experimental noise or from artifacts accompanying ChIP or library amplification prior to sequencing, we compared ChIP-seq profiles from wild-type yeast to those from med17 ts yeast after 45 min at 37 °C.
In yeast harboring this mutation, the head module is disrupted at 37 °C and mRNA transcription is greatly reduced genome-wide within 30 minutes. Furthermore, comparing ChIP against Mediator subunits and Pol II in wild-type and med17 ts yeast allowed us to detect decreased association of Mediator and Pol II even at constitutively active promoters with relatively low amounts of Mediator association, while the relatively short temperature shift reduces the likelihood of indirect effects. We also compared the association of Mediator subunits and Pol II in wild-type and med3 med15 yeast, which lack two of the three subunits of the tail module triad Med2/Med3/Med15, thus providing insight into the genome-wide function of the tail module in Mediator recruitment. These experiments are currently being extended with a new set of mutants to further understand the activities of the Mediator complex in gene regulation. We show that the Mediator co-activator complex regulates Ty1 retromobility by controlling the balance between the Ty1i and Ty1 promoters.

To better understand intron retention in RNA-seq data, we have developed a new software application, TPMCalculator, to quantify the mRNA abundance of genomic features. We have applied this software to the TCGA cancer genomic data and continue to interpret the results.

Together, these analyses showed that the data need to be processed in a more organized way. The management of NGS data produced by different technologies, such as RNA-seq, ChIP-seq, ATAC-seq, and DNA methylation profiling, is complex and demands advanced bioinformatics skills. For example, pre-processing quality control and sample selection based on principal component analysis (PCA) are tasks that should be easily available to researchers producing sequencing samples.
In this work we present an open-source, containerized framework for the integration and management of RNA-seq, ChIP-seq, and ATAC-seq data that is easily executed on most workstations. The framework offers a user-friendly interface for executing the basic steps of data analysis, allowing researchers to evaluate the samples they produce quickly and straightforwardly. It comprises a set of NGS data analysis workflows and pipelines in CWL format, a Python-Django back end for data management, and a set of Jupyter notebooks as the user interface. Analysis reports with tables, figures, and plots are generated automatically from the data files, with detail and resolution ready for scientific publication. We are in the process of finalizing this project for publication.
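
As a brief illustration of the abundance measure mentioned in the summary, the sketch below computes transcripts per million (TPM) using the standard textbook definition: read counts are divided by feature length, then scaled so that all values sum to one million. This is not TPMCalculator's implementation, whose internals are not described here; the function name `tpm`, the feature names, and all counts and lengths are hypothetical and chosen only for illustration.

```python
from typing import Dict


def tpm(counts: Dict[str, float], lengths_bp: Dict[str, float]) -> Dict[str, float]:
    """Standard TPM: length-normalize counts, then scale to sum to one million."""
    # Reads per kilobase (RPK) for each feature.
    rpk = {name: counts[name] / (lengths_bp[name] / 1000.0) for name in counts}
    total_rpk = sum(rpk.values())
    if total_rpk == 0:
        # No reads anywhere: report zero abundance for every feature.
        return {name: 0.0 for name in counts}
    return {name: value / total_rpk * 1e6 for name, value in rpk.items()}


# Hypothetical toy data: exonic and intronic features of two genes.
counts = {"geneA_exon": 500, "geneA_intron": 50, "geneB_exon": 1000}
lengths_bp = {"geneA_exon": 2000, "geneA_intron": 5000, "geneB_exon": 4000}

values = tpm(counts, lengths_bp)
for name, value in values.items():
    print(f"{name}\t{value:.2f}")
```

Treating exons and introns as separate features, as in this toy input, is one plausible way to compare intronic and exonic signal for the same gene when looking at intron retention; it is an assumption of the example rather than a description of the published tool.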

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Investigator-Initiated Intramural Research Projects (ZIA)
Project #: 1ZIALM000084-22
Application #: 10018391
Support Year: 22
Fiscal Year: 2019
Name: National Library of Medicine

Salinero, Alicia C; Knoll, Elisabeth R; Zhu, Z Iris et al. (2018) The Mediator co-activator complex regulates Ty1 retromobility by controlling the balance between Ty1i and Ty1 promoters. PLoS Genet 14:e1007232
Ciftci-Yilmaz, Sultan; Au, Wei-Chun; Mishra, Prashant K et al. (2018) A Genome-Wide Screen Reveals a Role for the HIR Histone Chaperone Complex in Preventing Mislocalization of Budding Yeast CENP-A. Genetics 210:203-218
Li, Shan; Alvarez, Roberto Vera; Sharan, Roded et al. (2017) Quantifying deleterious effects of regulatory variants. Nucleic Acids Res 45:2307-2317
Zhang, Shaofei; Zhu, Iris; Deng, Tao et al. (2016) HMGN proteins modulate chromatin regulatory sites and gene expression during activation of naïve B cells. Nucleic Acids Res :
Deng, Tao; Zhu, Z Iris; Zhang, Shaofei et al. (2015) Functional compensation among HMGN variants modulates the DNase I hypersensitive sites at enhancers. Genome Res 25:1295-308
Paul, Emily; Zhu, Z Iris; Landsman, David et al. (2015) Genome-wide association of mediator and RNA polymerase II in wild-type and mediator mutant yeast. Mol Cell Biol 35:331-42
Yu, Weishi; McIntosh, Carl; Lister, Ryan et al. (2014) Genome-wide DNA methylation patterns in LSH mutant reveals de-repression of repeat elements and redundant epigenetic silencing pathways. Genome Res 24:1613-23
Deng, Tao; Zhu, Z Iris; Zhang, Shaofei et al. (2013) HMGN1 modulates nucleosome occupancy and DNase I hypersensitivity at the CpG island promoters of embryonic stem cells. Mol Cell Biol 33:3377-89
Hansen, Loren; Mariño-Ramírez, Leonardo; Landsman, David (2012) Differences in local genomic context of bound and unbound motifs. Gene 506:125-34
Rochman, Mark; Taher, Leila; Kurahashi, Toshihiro et al. (2011) Effects of HMGN variants on the cellular transcription profile. Nucleic Acids Res 39:4076-87
