Our goal is to discover and use relationships between mouse and human regulatory genomes to advance the ENCODE Project in its effort to map all functional elements in the human genome. Our comparative approach aims to uncover principles and solve problems that are proving difficult to address by studying the human genome alone. ENCODE is vigorously mapping hundreds of function-associated biochemical markers in selected cell lines, and has already produced tens of millions of reproducible biochemical features. As anticipated, some observed protein:DNA interactions find and refine known transcriptional enhancers, promoters, and silencers, together with associated chromatin structure. But substantial questions arise as to how many of the myriad biochemical events are functional, what those functions are, and which gene or genes are meaningful targets. To highlight and sort functionally important biochemical marks from others, we will systematically identify the molecular events retained by both mouse and human since they diverged. We will then analyze how conservation of biochemical features relates to conservation of DNA sequence and conservation of regulated gene expression. By using the mouse, we can leverage decades of molecular genetics and manipulated mouse genomes that do not exist for any other mammal.
In Aim 1, we execute genome-wide assays for biochemical signatures of functional DNA sequences in a few specific mouse cell types. By using well-studied mouse lines and cell states, we can interpret results in light of previously validated elements and in light of ENCODE human results. We will use ENCODE standards for high-throughput, sequence-based assays to determine gene expression, DNase hypersensitive sites, histone modifications and selected transcription factor occupancy in seven mouse cell types. The eight selected features are the most informative ones for function, and thus most useful for comparison with human data.
In Aim 2, we apply a genome-wide implementation of chromosome conformation capture to map the interactions between transcription factor binding sites and their responsive genes in two cell types. These results will be compared to those from an ENCODE developmental project. Comparative analysis in Aim 3 will ensure that the impact of the data we produce goes beyond the individual mouse cell systems per se. To do this, we have organized a collaboration of investigators at multiple institutions, in which each group is expert in one or more critical aspects. Our data, made public and accessible via ENCODE, will fuel and accelerate many future studies after the 2-yr stimulus, both in and beyond ENCODE. This responds to the NHGRI request for applications on "Enhancement of the value of the human ENCODE Project by conducting a parallel effort on the mouse genome." The proposed work will improve the maps of biologically functional DNA sequences in humans, which in turn will help explain how variants in human genome sequences could be associated with human diseases, leading to candidates for novel avenues for effective therapy and prevention.

Public Health Relevance

People differ in their responses to pathogens and in their likelihood of suffering from complex diseases such as cancer, heart disease or diabetes. Individual susceptibility to disease is determined in part by genetics, and we can map with high precision the locations of DNA variants associated with disease susceptibility. In order to understand how these variants contribute to disease susceptibility, we need to identify the biological functions of all DNA sequences; the proposed work will help us map these functional DNA sequences.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
High Impact Research and Research Infrastructure Programs (RC2)
Project #
3RC2HG005573-02S1
Application #
8321719
Study Section
Special Emphasis Panel (ZHG1-HGR-M (O1))
Program Officer
Feingold, Elise A
Project Start
2009-09-26
Project End
2013-07-31
Budget Start
2011-08-01
Budget End
2013-07-31
Support Year
2
Fiscal Year
2011
Total Cost
$749,992
Indirect Cost
Name
Pennsylvania State University
Department
Biochemistry
Type
Schools of Arts and Sciences
DUNS #
003403953
City
University Park
State
PA
Country
United States
Zip Code
16802
Zhang, Yu; An, Lin; Yue, Feng et al. (2016) Jointly characterizing epigenetic dynamics across multiple human cell types. Nucleic Acids Res 44:6721-31
Han, G Celine; Vinayachandran, Vinesh; Bataille, Alain R et al. (2016) Genome-Wide Organization of GATA1 and TAL1 Determined at High Resolution. Mol Cell Biol 36:157-72
Stonestrom, Aaron J; Hsu, Sarah C; Jahn, Kristen S et al. (2015) Functions of BET proteins in erythroid gene expression. Blood 125:2825-34
Makova, Kateryna D; Hardison, Ross C (2015) The effects of chromatin organization on variation in mutation rates in the genome. Nat Rev Genet 16:213-23
Denas, Olgert; Sandstrom, Richard; Cheng, Yong et al. (2015) Genome-wide comparative analysis reveals human-mouse regulatory landscape and evolution. BMC Genomics 16:87
Hsiung, Chris C-S; Morrissey, Christapher S; Udugama, Maheshi et al. (2015) Genome accessibility is widely preserved and locally modulated during mitosis. Genome Res 25:213-25
Dogan, Nergiz; Wu, Weisheng; Morrissey, Christapher S et al. (2015) Occupancy by key transcription factors is a more accurate predictor of enhancer activity than histone modifications or chromatin accessibility. Epigenetics Chromatin 8:16
Jain, Deepti; Mishra, Tejaswini; Giardine, Belinda M et al. (2015) Dynamics of GATA1 binding and expression response in a GATA1-induced erythroid differentiation system. Genom Data 4:1-7
Paralkar, Vikram R; Mishra, Tejaswini; Luan, Jing et al. (2014) Lineage and species-specific long noncoding RNAs during erythro-megakaryocytic development. Blood 123:1927-37
Wu, Weisheng; Morrissey, Christapher S; Keller, Cheryl A et al. (2014) Dynamic shifts in occupancy by TAL1 are guided by GATA factors and drive large-scale reprogramming of gene expression during hematopoiesis. Genome Res 24:1945-62
