This application addresses broad Challenge Area (06) Enabling Technologies and specific Challenge Topic 06-HG-103: Methods to sequence highly variable, repeat-rich regions of complex genomes. A remarkable 45% of our genome consists of repetitive elements, more than 40-fold the mass contribution of protein-coding sequences. Retrotransposons termed long interspersed elements (LINEs) are among the most abundant and dynamic of these. More than 500,000 LINEs, both intact 6 kb elements and fragments, make up 17% of the human genome. Their presence is intriguing because LINEs are major forces in the evolution of mammalian genomes, with the potential to significantly alter neighboring gene expression levels and mRNA structure. The youngest LINEs, known as T(a)LINEs (transcriptionally active LINEs), retain retrotransposition activity, creating significant genetic differences across human populations and inherited disease through germ-line integration, as well as somatic transforming mutations in cancer. Each of our genomes harbors about 500 T(a)LINEs, approximately 100 of which are intact, autonomous transposons capable of "copy-and-paste" retrotransposition. Finally, their insertion into both coding and noncoding regions has been associated with a wide variety of functional effects, implicating them as a potentially major source of human phenotypic diversity. The role of retrotransposons in human disease remains poorly understood, largely because of the massive number of LINEs in our genomes and their large size. Of necessity, LINEs have been excluded from array-based copy number variation studies and next-generation whole-genome sequencing efforts. My laboratory recently published a coupled vectorette PCR-microarray method, termed transposon insertion profiling (TIP-Chip), for mapping repetitive elements in S. cerevisiae.
We have demonstrated that this technology enables mapping of human T(a)LINEs, and propose the first major comprehensive survey of T(a)LINEs in reference DNA samples to begin to characterize this underexplored aspect of our genomes.

Public Health Relevance

Much of our DNA is derived from LINE retrotransposons. The youngest family of these elements, T(a)LINEs, remain mobile and are major but poorly characterized sources of genomic structural variation across human populations. Moreover, their insertion into both coding and noncoding regions has been associated with a wide variety of functional effects, implicating them as a potentially major source of human phenotypic diversity. A method newly developed in this laboratory for identifying the locations of T(a)LINEs will be used to comprehensively map these sequences in 120 reference DNA samples and to generate a public database of insertion sites and frequencies. This effort will add a new dimension to our understanding of the human genome and provide the basis for future biomedical research investigating the impact of transposable elements on human health and disease.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
NIH Challenge Grants and Partnerships Program (RC1)
Project #
5RC1HG005359-02
Application #
7936352
Study Section
Special Emphasis Panel (ZRG1-GGG-F (58))
Program Officer
Felsenfeld, Adam
Project Start
2009-09-22
Project End
2012-08-31
Budget Start
2010-09-01
Budget End
2012-08-31
Support Year
2
Fiscal Year
2010
Total Cost
$497,006
Indirect Cost
Name
Johns Hopkins University
Department
Biochemistry
Type
Schools of Medicine
DUNS #
001910777
City
Baltimore
State
MD
Country
United States
Zip Code
21218
Gnanakkan, Veena P; Jaffe, Andrew E; Dai, Lixin et al. (2013) TE-array--a high throughput tool to study transposon transcription. BMC Genomics 14:869
Burns, Kathleen H; Boeke, Jef D (2012) Human transposon tectonics. Cell 149:740-52
Huang, Cheng Ran Lisa; Schneider, Anna M; Lu, Yunqi et al. (2010) Mobile interspersed repeats are major structural variants in the human genome. Cell 141:1171-82