Despite more than 30 years of substantial effort in developing sequencing technologies and computational software, the full genome of any but the simplest organisms still cannot be reconstructed automatically. The length of the DNA sequences that can be 'read' by modern sequencing systems is substantially smaller than the length of most genomes (thousands of base pairs versus millions to billions), making it virtually impossible to use the fragmented information generated by the shotgun sequencing process to reconstruct the long-range information linking together genomic segments that belong to the same chromosome. The main reason genome assembly is difficult is genomic repeats: segments of DNA that occur in multiple identical or near-identical copies throughout a genome. Any repeat longer than a sequencing read introduces ambiguity in the possible reconstructions of a genome: a number of different genomes that is exponential in the number of repeats can be constructed from the same set of reads, and only one of them is the true reconstruction of the genome being assembled. Finding this one correct genome among the many alternatives is impossible without additional information, such as mate-pair information constraining the relative placement of pairs of shotgun reads along the genome. Mate-pair information is routinely generated in sequencing experiments and has been critical to scientists' ability to reconstruct genomes from shotgun data (e.g., mate-pair information was crucial to the success of the first prokaryotic genome project, Haemophilus influenzae). Given these outstanding issues, a series of interlocking aims is proposed that center on enhanced optical and electronic detection of specially decorated genomic DNA molecules.
The aims are designed to enable new technologies that will provide sufficient physical map information to integrate closely with modern sequencing data for comprehensive assembly of complex genomes. These proposed advancements will be housed within a new generation of nanofluidic devices that offer novel means of molecular control and detection. Such efforts will be directed by state-of-the-art computer simulations that model novel aspects of the new platforms, allowing rapid cycles of design, implementation, and testing. These technological developments will both be guided by and serve a broad-based bioinformatics framework that will be developed for this work, while laying the basis for highly integrated approaches to genome assembly and analysis.
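
As a toy illustration of the repeat ambiguity described above (it is not part of the proposed work), the following Python sketch builds two different genomes that share the same repeat and shows that they yield exactly the same reads when the reads are no longer than the repeat plus one base. All sequences, names, and the idealized read model (error-free reads, one starting at every position) are assumptions made for illustration only.

```python
from collections import Counter


def read_spectrum(genome: str, read_len: int) -> Counter:
    """Multiset of idealized, error-free reads: one read per start position."""
    return Counter(
        genome[i:i + read_len] for i in range(len(genome) - read_len + 1)
    )


# Hypothetical sequences: one repeat occurring three times, four unique segments.
REPEAT = "GATTACA"
A, B, C, D = "CCGT", "TTGC", "AAGG", "CTCA"

# Two distinct genomes that differ only in the order of the segments
# lying between copies of the repeat.
genome_1 = A + REPEAT + B + REPEAT + C + REPEAT + D
genome_2 = A + REPEAT + C + REPEAT + B + REPEAT + D

short_read = len(REPEAT) + 1  # a read cannot reach both sides of a repeat copy
long_read = len(REPEAT) + 2   # a read can span a repeat copy plus both flanks

same_with_short_reads = (
    read_spectrum(genome_1, short_read) == read_spectrum(genome_2, short_read)
)
same_with_long_reads = (
    read_spectrum(genome_1, long_read) == read_spectrum(genome_2, long_read)
)

print("genomes differ:", genome_1 != genome_2)
print("indistinguishable with short reads:", same_with_short_reads)
print("indistinguishable with long reads:", same_with_long_reads)
```

In this sketch the short reads cannot tell the two genomes apart, whereas reads that span a whole repeat copy can. Mate pairs and physical maps play a similar disambiguating role by supplying information over distances longer than a single read.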

Public Health Relevance

This project proposes to develop new machines and software that will rapidly analyze a person's genome and reveal new types of information that doctors will be able to use for treating patients. The machines to be developed are very small devices that may one day be sufficiently miniaturized to fit in a person's hand.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section: Special Emphasis Panel (ZRG1)
Program Officer: Smith, Michael
University of Wisconsin Madison
Graduate Schools
United States
Krerowicz, Samuel J W; Hernandez-Ortiz, Juan P; Schwartz, David C (2018) Microscale Objects via Restructuring of Large, Double-Stranded DNA Molecules. ACS Appl Mater Interfaces :
Kounovsky-Shafer, Kristy L; Hernandez-Ortiz, Juan P; Potamousis, Konstantinos et al. (2017) Electrostatic confinement and manipulation of DNA molecules for genome analysis. Proc Natl Acad Sci U S A 114:13400-13405
Lequieu, Joshua; Schwartz, David C; de Pablo, Juan J (2017) In silico evidence for sequence-dependent nucleosome sliding. Proc Natl Acad Sci U S A 114:E9197-E9205
Lequieu, Joshua; Córdoba, Andrés; Schwartz, David C et al. (2016) Tension-Dependent Free Energies of Nucleosome Unwrapping. ACS Cent Sci 2:660-666
Li, Yang; Zhou, Shiguo; Schwartz, David C et al. (2016) Allele-Specific Quantification of Structural Variations in Cancer Genomes. Cell Syst 3:21-34
Mendelowitz, Lee M; Schwartz, David C; Pop, Mihai (2016) Maligner: a fast ordered restriction map aligner. Bioinformatics 32:1016-22
Park, Dong-Wook; Kim, Hyungsoo; Bong, Jihye et al. (2016) Flexible bottom-gate graphene transistors on Parylene C substrate and the effect of current annealing. Appl Phys Lett 109:152105
Lee, Seonghyun; Oh, Yeeun; Lee, Jungyoon et al. (2016) DNA binding fluorescent proteins for the direct visualization of large DNA molecules. Nucleic Acids Res 44:e6
Zhou, Shiguo; Goldstein, Steve; Place, Michael et al. (2015) A clone-free, single molecule map of the domestic cow (Bos taurus) genome. BMC Genomics 16:644
Hernández-Ortiz, Juan P; de Pablo, Juan J (2015) Self-consistent description of electrokinetic phenomena in particle-based simulations. J Chem Phys 143:014108

Showing the most recent 10 out of 60 publications