We have developed a computational pipeline to identify DNA sequence variations in the human genome. The pipeline uses the shotgun sequencing data that were generated by the SNP Consortium from 24 diverse humans (Nature 409, 928-933). Our preliminary data indicate that this collection of approximately 7.1 million shotgun sequencing reads is a rich resource for discovering a wide range of DNA polymorphisms in humans, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and transposon polymorphisms. The major goal of this proposal is to use our pipeline to identify additional DNA sequence variations and then perform verification experiments to measure the accuracy of our predictions. We will focus most of our efforts on identifying INDEL polymorphisms in humans. We will also identify and study the subclass of INDELs that are caused by de novo transposon insertions in humans. Such elements represent a source of human variation and, as endogenous mutagens, may also be responsible for generating mutations that lead to human diseases. Finally, we will use our methods to identify INDELs that have occurred in the chimpanzee genome relative to the human genome. Such INDELs will help us to identify additional forms of mobile DNA in chimpanzees and humans, and may also provide insights into the genetic events leading to the speciation of these organisms.