Despite more than 30 years of substantial effort in developing sequencing technologies and computational software, the full genome of any but the simplest organisms still cannot be reconstructed automatically. The length of the DNA sequences that can be 'read' by modern sequencing systems is substantially smaller than the length of most genomes (thousands of base pairs versus millions to billions), making it virtually impossible to use the fragmented information generated by the shotgun sequencing process to reconstruct the long-range information linking together genomic segments that belong to the same chromosome. The main reason genome assembly is difficult is genomic repeats: segments of DNA that occur in multiple identical or near-identical copies throughout a genome. Any repeat longer than a sequencing read introduces ambiguity in the possible reconstructions of a genome: a number of different genomes that is exponential in the number of repeats can be constructed from the same set of reads, and only one of them is the true reconstruction of the genome being assembled. Finding this one correct genome among the many alternatives is impossible without additional information, such as mate-pair information constraining the relative placement of pairs of shotgun reads along the genome. Mate-pair information is routinely generated in sequencing experiments and has been critical to scientists' ability to reconstruct genomes from shotgun data (e.g., mate-pair information was crucial to the success of the first prokaryotic genome project, Haemophilus influenzae). Given these outstanding issues, a series of interlocking aims is proposed that center on enhanced optical and electronic detection of specially decorated genomic DNA molecules.
The aims are designed to enable new technologies that will provide sufficient physical map information to integrate closely with modern sequencing data for comprehensive assembly of complex genomes. These proposed advancements will be housed within a new generation of nanofluidic devices that provide novel means for molecular control and detection. The efforts will be directed by state-of-the-art computer simulations that model novel aspects of the new platforms, allowing rapid cycles of design, implementation, and testing. These technological developments will both be guided by and serve a broad-based bioinformatics framework that will be developed for this work, while laying the basis for highly integrated approaches to genome assembly and analysis.
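
The repeat ambiguity described in the abstract can be illustrated with a toy example. The sketch below is not part of the proposed work: the sequences, the 8 bp repeat, the read lengths, and the helper name read_spectrum are all hypothetical, and reads are idealized as error-free substrings that cover every position. Under those assumptions, two different genomes that differ only in the order of the segments between repeat copies yield exactly the same collection of short reads, while reads long enough to span a whole repeat plus its flanking bases tell the two genomes apart.

```python
from collections import Counter


def read_spectrum(genome: str, read_len: int) -> Counter:
    """Multiset of every substring of length read_len.

    Assumption: this stands in for idealized, error-free shotgun reads
    that start at every position of the genome.
    """
    return Counter(
        genome[i:i + read_len] for i in range(len(genome) - read_len + 1)
    )


# Hypothetical 8 bp repeat that occurs three times in each toy genome.
REPEAT = "ACGTTGCA"

# The two genomes differ only in the order of the unique segments
# ("CC" and "GG") that sit between the repeat copies.
genome_a = "TTT" + REPEAT + "CC" + REPEAT + "GG" + REPEAT + "AAA"
genome_b = "TTT" + REPEAT + "GG" + REPEAT + "CC" + REPEAT + "AAA"

SHORT_READ = 6                # shorter than the repeat
LONG_READ = len(REPEAT) + 2   # spans a repeat plus one base on each side

# Short reads cannot distinguish the two genomes.
same_with_short_reads = (
    read_spectrum(genome_a, SHORT_READ) == read_spectrum(genome_b, SHORT_READ)
)

# Reads that bridge the repeat resolve the ambiguity.
same_with_long_reads = (
    read_spectrum(genome_a, LONG_READ) == read_spectrum(genome_b, LONG_READ)
)

print("genomes differ:", genome_a != genome_b)
print("identical short-read data:", same_with_short_reads)
print("identical long-read data:", same_with_long_reads)
```

With more repeat copies, more orderings of the intervening segments become consistent with the same short reads, which is why long-range information such as mate pairs or physical maps is needed to select the true genome.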

Public Health Relevance

This project proposes to develop new machines and software that will rapidly analyze a person's genome and reveal new types of information that doctors will be able to use for treating patients. The machines to be developed are very small devices that may one day be sufficiently miniaturized to fit in a person's hand.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Special Emphasis Panel (ZRG1-GGG-H (02))
Program Officer
Schloss, Jeffery
University of Wisconsin Madison
Other Domestic Higher Education
United States
Manning, Viola A; Pandelova, Iovanna; Dhillon, Braham et al. (2013) Comparative genomics of a plant-pathogenic fungus, Pyrenophora tritici-repentis, reveals transduplication and the impact of repeat elements on pathogenicity and population divergence. G3 (Bethesda) 3:41-63
Ray, Mohana; Goldstein, Steve; Zhou, Shiguo et al. (2013) Discovery of structural alterations in solid tumor oligodendroglioma by single molecule analysis. BMC Genomics 14:505
van Heesch, Sebastiaan; Kloosterman, Wigard P; Lansu, Nico et al. (2013) Improving mammalian genome scaffolding using large insert mate-pair next-generation sequencing. BMC Genomics 14:257
Sarkar, Deepayan; Goldstein, Steve; Schwartz, David C et al. (2012) Statistical significance of optical map alignments. J Comput Biol 19:478-92
Kim, Yoori; Kim, Ki Seok; Kounovsky, Kristy L et al. (2011) Nanochannel confinement: DNA stretch approaching full contour length. Lab Chip 11:1721-9
Teague, Brian; Waterman, Michael S; Goldstein, Steven et al. (2010) High-resolution human genome structure by single-molecule analysis. Proc Natl Acad Sci U S A 107:10848-53
Antonacci, Francesca; Kidd, Jeffrey M; Marques-Bonet, Tomas et al. (2010) A large and complex structural polymorphism at 16p12.1 underlies microdeletion disease risk. Nat Genet 42:745-50
Zhang, Penghua; Too, Priscilla Hiu-Mei; Samuelson, James C et al. (2010) Engineering BspQI nicking enzymes and application of N.BspQI in DNA labeling and production of single-strand DNA. Protein Expr Purif 69:226-34
Yu, Hua; Jo, Kyubong; Kounovsky, Kristy L et al. (2009) Molecular propulsion: chemical sensing and chemotaxis of DNA driven by RNA polymerase. J Am Chem Soc 131:5722-3
Sambriski, E J; Schwartz, D C; de Pablo, J J (2009) A mesoscale model of DNA and its renaturation. Biophys J 96:1675-90

Showing the most recent 10 out of 41 publications