Repetitive transposable elements (TEs) comprise over 50% of the human genome. While some investigators regard TEs as "parasitic" DNA, other studies suggest that TEs play a more constructive role in genome evolution by providing raw material for new biological functions. For example, TEs commonly harbor active cis-regulatory elements that are occasionally co-opted during evolution to wire new gene regulatory networks. While investigators now recognize the importance of TEs in gene regulation, TEs remain under-analyzed in high-throughput data because of methodological hurdles associated with their repetitive nature. Thus, the impact of TEs on the regulation of the human genome, both in normal development and disease, remains largely uncharacterized. We propose to develop novel computational methods to assess and clarify the impact of TEs in regulatory innovation using ENCODE data.
In Specific Aim 1 we will develop new algorithms and statistical methods to predict active regulatory elements encoded by TEs from heterogeneous ENCODE data. If successful, we will generate a profile of TE-derived regulatory elements and their predicted targets across diverse cell/tissue types and developmental stages, revealing new gene regulatory networks wired by TEs. With these new methods we also intend to examine the extent of TE dysregulation in cancer cells and its transcriptional consequences.
In Specific Aim 2 we will extend the models developed in Aim 1 to understand the role of TEs in shaping the 3D topology of the genome, which is intimately connected to genome function. We will investigate the role of TEs in partitioning the genome into chromosomal domains that orchestrate communication between cis-regulatory elements and their target genes. In particular, we will quantify the extent to which TEs drive conservation and divergence in genome topology across mammal species.
In Specific Aim 3 we will take advantage of the repetitive nature of TEs to develop a novel statistical model that links sequence changes in different copies of TEs to epigenetic and functional differences. The numerous but slightly different copies of a TE present in a single genome provide a unique opportunity to identify sequence variants that underlie epigenetic modification, which will further our understanding of how TEs become co-opted for host gene regulation.
Finally, in Specific Aim 4, we will deploy our recently developed Repeat Element Browser as a web portal and downloadable application specifically tailored for investigators to analyze, visualize and explore data produced by ENCODE and other sources, as well as their own data, in the context of TEs. The methods developed in this proposal will have a high impact on the utility of the data produced by ENCODE and will greatly expand our understanding of the contribution of TEs to non-coding regulatory elements in healthy tissues and disease.

Public Health Relevance

Transposable elements (TEs) are a special class of short DNA sequences that copy and paste themselves to new locations in the genome. Through repeated copying and pasting, TEs now comprise over 50% of the human genome sequence. When TEs paste themselves near genes, they can have profound effects on the way those genes are regulated, both in health and disease. Despite their importance, TEs remain poorly characterized. The same property that makes them special, namely their ability to copy and paste across the genome, makes them highly repetitive and therefore recalcitrant to large-scale analyses, such as the ENCODE project. To address this problem, we propose to develop a new set of computational methods to profile the regulatory activity of human TEs across anatomical and developmental space, taking advantage of comparisons between human and mouse to study the impact of TEs in regulatory evolution. We will use this comprehensive profile generated from healthy cells and tissues to identify TE mis-regulation in disease, including cancer, and its regulatory consequences.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project--Cooperative Agreements (U01)
Study Section
Special Emphasis Panel (ZHG1)
Program Officer
Gilchrist, Daniel A
Washington University
Schools of Medicine
Saint Louis
United States
Wang, Yanli; Song, Fan; Zhang, Bo et al. (2018) The 3D Genome Browser: a web-based browser for visualizing 3D genome organization and long-range chromatin interactions. Genome Biol 19:151
Sundaram, Vasavi; Wang, Ting (2018) Transposable Element Mediated Innovation in Gene Regulatory Landscapes of Cells: Re-Visiting the "Gene-Battery" Model. Bioessays 40:
Cheng, Cheng; Deng, Pan-Yue; Ikeuchi, Yoshiho et al. (2018) Characterization of a Mouse Model of Börjeson-Forssman-Lehmann Syndrome. Cell Rep 25:1404-1414.e6
Zhang, Chengkang; Lee, Hyung Joo; Shrivastava, Anura et al. (2018) Long-Term In Vitro Expansion of Epithelial Stem Cells Enabled by Pharmacological Inhibition of PAK1-ROCK-Myosin II and TGF-β Signaling. Cell Rep 25:598-610.e5
Jiang, Kaiyu; Wong, Laiping; Chen, Yanmin et al. (2018) Soluble inflammatory mediators induce transcriptional re-organization that is independent of dna methylation changes in cultured human chorionic villous trophoblasts. J Reprod Immunol 128:2-8
Xing, Xiaoyun; Zhang, Bo; Li, Daofeng et al. (2018) Comprehensive Whole DNA Methylome Analysis by Integrating MeDIP-seq and MRE-seq. Methods Mol Biol 1708:209-246
Agrawal, A; Chou, Y-L; Carey, C E et al. (2018) Genome-wide association study identifies a novel locus for cannabis dependence. Mol Psychiatry 23:1293-1302
Dai, Xiaoyu; Lin, Nan; Li, Daofeng et al. (2018) A non-randomized procedure for large-scale heterogeneous multiple discrete testing based on randomized tests. Biometrics :
Zhu, Liangliang; Yan, Feihu; Wang, Zhen et al. (2018) Genome-wide DNA methylation profiling of primary colorectal laterally spreading tumors identifies disease-specific epimutations on common pathways. Int J Cancer 143:2488-2498
Shen, Jie; Wang, Cuicui; Li, Daofeng et al. (2017) DNA methyltransferase 3b regulates articular cartilage homeostasis by altering metabolism. JCI Insight 2:
