We have developed a computational pipeline to identify DNA sequence variations in the human genome. The pipeline uses the shotgun sequencing data that were generated by the SNP Consortium from 24 diverse humans (Nature 409, 928-933). Our preliminary data indicate that this collection of approximately 7.1 million shotgun sequencing reads is a rich resource for discovering a wide range of DNA polymorphisms in humans, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and transposon polymorphisms. The major goal of this proposal is to use our pipeline to identify additional DNA sequence variations and then to perform verification experiments that measure the accuracy of our predictions. We will focus most of our efforts on identifying INDEL polymorphisms in humans. We will also identify and study the subclass of INDELs that are caused by de novo transposon insertions in humans. Such elements represent a source of human variation and, as endogenous mutagens, may also be responsible for generating mutations that lead to human diseases. Finally, we will use our methods to identify INDELs that have occurred in the chimpanzee genome relative to the human genome. Such INDELs will help us to identify additional forms of mobile DNA in chimpanzees and humans, and may also provide insights into the genetic events that led to the speciation of these organisms.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section: Genome Study Section (GNM)
Program Officer: Brooks, Lisa
Emory University
Schools of Medicine
United States
Connolly, Nina P; Shetty, Amol C; Stokum, Jesse A et al. (2018) Cross-species transcriptional analysis reveals conserved and host-specific neoplastic processes in mammalian glioma. Sci Rep 8:1180
Gardner, Eugene J; Lam, Vincent K; Harris, Daniel N et al. (2017) The Mobile Element Locator Tool (MELT): population-scale mobile element discovery and biology. Genome Res 27:1916-1929
Scott, Emma C; Devine, Scott E (2017) The Role of Somatic L1 Retrotransposition in Human Cancers. Viruses 9:
Scott, Emma C; Gardner, Eugene J; Masood, Ashiq et al. (2016) A hot L1 retrotransposon evades somatic repression and initiates human colorectal cancer. Genome Res 26:745-55
Sudmant, Peter H; Rausch, Tobias; Gardner, Eugene J et al. (2015) An integrated map of structural variation in 2,504 human genomes. Nature 526:75-81
Nugent, Bridget M; Wright, Christopher L; Shetty, Amol C et al. (2015) Brain feminization requires active repression of masculinization via DNA methylation. Nat Neurosci 18:690-7
1000 Genomes Project Consortium; Auton, Adam; Brooks, Lisa D et al. (2015) A global reference for human genetic variation. Nature 526:68-74
Delaneau, Olivier; Marchini, Jonathan; 1000 Genomes Project Consortium et al. (2014) Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel. Nat Commun 5:3934
Colonna, Vincenza; Ayub, Qasim; Chen, Yuan et al. (2014) Human genomic regions with exceptionally high levels of population differentiation identified from 911 whole-genome sequences. Genome Biol 15:R88
Khurana, Ekta; Fu, Yao; Colonna, Vincenza et al. (2013) Integrative annotation of variants from 1092 humans: application to cancer genomics. Science 342:1235587

Showing the most recent 10 out of 23 publications