At least half of the human genome is derived from transposable elements (TEs). While some investigations regard TEs as "parasitic" DNA, other studies suggest that TEs play a more constructive role in genome evolution by providing raw material for new biological functions. TEs commonly harbor active cis-regulatory elements that are occasionally co-opted during evolution to wire new gene regulatory networks. However, TEs remain under-analyzed in high-throughput data because of methodological hurdles associated with their repetitive nature. Thus, the impact of TEs on the regulation of the human genome, both in normal development and in disease, remains largely uncharacterized. We propose to develop advanced genomics approaches to assess and clarify the impact of TEs on regulatory innovation, on regulatory conservation, and in human diseases.
In Aim 1 we will combine a novel statistical framework with massively parallel reporter gene assays to understand the TE sequence features that contribute to gene regulation. We will take advantage of the repetitive nature of TEs to link sequence changes among different copies of a TE to epigenetic and functional differences, and we will test their regulatory activities using a new genome-integrated massively parallel reporter gene assay.
In Aim 2 we will extend the models developed in Aim 1 to understand the role of TEs in shaping the 3D topology of the genome, which is intimately connected to genome function. We will quantify the extent to which TEs underlie the conservation and/or divergence of genome topology across mammalian species.
In Aim 3 we will develop technologies to detect TE-gene fusions linked to disease. We aim to detect cases where epigenetically de-repressed TEs initiate transcripts that splice into downstream genes, resulting in TE-gene fusion chimeric RNA and protein products. We will develop tools to detect such TE-gene fusion transcripts, and will adapt CRISPR-based genetic and epigenetic tools to manipulate TEs, which will allow us to establish whether TEs play a causal role in this type of abnormal gene activity.
In Aim 4 we will test the hypothesis that epigenetic inhibitors commonly used as therapeutics alter the epigenetic regulation of TEs.
Through the aims of this proposal we hope to develop an understanding of which sequence features drive the regulatory potential of TEs, and of the modes of evolution followed by different families of TEs during regulatory network evolution. Such an understanding will improve our picture of regulatory network evolution by including the effects of TEs, a major class of fast-evolving sequences that have been largely ignored in functional genomics studies. The methods developed in this proposal will have a high impact on the utility of data produced by consortia such as ENCODE, Roadmap, TCGA, and other large-scale projects, which currently discard most TE-derived sequences from their data. Such improvement will in turn accelerate research into the impact of TEs on normal gene regulation and in human diseases.

Public Health Relevance

Transposable elements (TEs) are a special class of short DNA sequences that copy and paste themselves to new locations in the genome. These elements comprise a large fraction of the DNA in mammalian genomes, including 50% of the human genome. Because of their repetitive nature, they are difficult to study and are discarded in most genomics studies. When TEs insert themselves near genes, they can have profound effects on the way those genes are regulated, both in health and in disease. We recently developed genomics tools to study these elements and showed that TEs often carry regulatory sequences that are co-opted by genomes to perform normal gene regulation. We propose to investigate the extent to which TEs contribute to normal gene regulation throughout the genome and how dysregulation of TE-derived sequences contributes to disease.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Felsenfeld, Adam
Washington University
Schools of Medicine
Saint Louis
United States
Wang, Yanli; Song, Fan; Zhang, Bo et al. (2018) The 3D Genome Browser: a web-based browser for visualizing 3D genome organization and long-range chromatin interactions. Genome Biol 19:151
Sundaram, Vasavi; Wang, Ting (2018) Transposable Element Mediated Innovation in Gene Regulatory Landscapes of Cells: Re-Visiting the "Gene-Battery" Model. Bioessays 40:
Cheng, Cheng; Deng, Pan-Yue; Ikeuchi, Yoshiho et al. (2018) Characterization of a Mouse Model of Börjeson-Forssman-Lehmann Syndrome. Cell Rep 25:1404-1414.e6
Zhang, Chengkang; Lee, Hyung Joo; Shrivastava, Anura et al. (2018) Long-Term In Vitro Expansion of Epithelial Stem Cells Enabled by Pharmacological Inhibition of PAK1-ROCK-Myosin II and TGF-β Signaling. Cell Rep 25:598-610.e5
Jiang, Kaiyu; Wong, Laiping; Chen, Yanmin et al. (2018) Soluble inflammatory mediators induce transcriptional re-organization that is independent of DNA methylation changes in cultured human chorionic villous trophoblasts. J Reprod Immunol 128:2-8
Xing, Xiaoyun; Zhang, Bo; Li, Daofeng et al. (2018) Comprehensive Whole DNA Methylome Analysis by Integrating MeDIP-seq and MRE-seq. Methods Mol Biol 1708:209-246
Agrawal, A; Chou, Y-L; Carey, C E et al. (2018) Genome-wide association study identifies a novel locus for cannabis dependence. Mol Psychiatry 23:1293-1302
Dai, Xiaoyu; Lin, Nan; Li, Daofeng et al. (2018) A non-randomized procedure for large-scale heterogeneous multiple discrete testing based on randomized tests. Biometrics :
Zhu, Liangliang; Yan, Feihu; Wang, Zhen et al. (2018) Genome-wide DNA methylation profiling of primary colorectal laterally spreading tumors identifies disease-specific epimutations on common pathways. Int J Cancer 143:2488-2498
Zhou, Jia; Sears, Renee L; Xing, Xiaoyun et al. (2017) Tissue-specific DNA methylation is conserved across human, mouse, and rat, and driven by primary sequence conservation. BMC Genomics 18:724