DNA sequencing is in the midst of disruptive technological shifts, with 454, Illumina, and SOLiD providing enormous increases in throughput and large reductions in cost per base. Massively parallel technologies deliver a few Gbp of sequence per week as short fragments, or reads. Applications of sequencing that were only recently considered impractical are now enabled: personal genome sequencing, "metagenomics" analysis of 'soups' containing several to hundreds of unique organisms, and de novo sequencing of novel genomes of complex organisms. No matter how the sequencing is done, reads must be assembled computationally if they are to be useful. Given the read length and read quality limitations of new instruments and the massive volume of data generated, the computational assembly problem is becoming critical, with the cost of computational infrastructure and personnel exceeding reagent and instrument-related costs. Moreover, the results of assembly are currently far from ideal; for example, much of the human genome remains invisible due to its high percentage of repeats. We propose to develop a new "front end" to next-gen sequencers for DNA preparation, the "Read-Cloud Method", which can reduce the computational cost of genome assembly by 2-3 orders of magnitude, produce more complete and accurate genomes, and make metagenomics tractable. We propose a hierarchical sequencing approach that requires no bacterial cloning. We will achieve this by handling single DNA molecules, tiled across the genome with high redundancy, on microfluidic devices.
We will design, prototype, and thoroughly test technology to (i) shear genomic DNA into 200-kbp fragments with narrow size distributions; (ii) randomly amplify each individual 200-kbp DNA molecule in isolation, within a porous gel microcontainer that will be formed around the dsDNA molecule within a microdevice; (iii) digest micro-encapsulated DNA into small fragments of tunable size; (iv) bar-code the progeny of each 200-kbp DNA with a 12mer oligonucleotide, to identify each read as associated with a particular 200-kbp DNA. A planar microfluidic device will be fabricated to allow one unique bar-code sequence to be blunt-end-ligated to both DNA termini. Bar-coded DNA is then pooled, and next-gen sequencing is done. The result is a highly reducible data set. The method and algorithm are universally applicable to next-generation platforms. The PIs (Batzoglou, Barron, Shaqfeh, Quake) will collaborate to develop an efficient approach to hierarchical sequencing in microfluidic devices.
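The data set is described as "highly reducible" because every read carries the bar-code of the 200-kbp molecule it came from, so reads can be partitioned into small independent groups before assembly. The Python sketch below illustrates only that partitioning step; it is not the project's algorithm. It assumes, hypothetically, that the 12mer bar-code occupies the first 12 bases of each read and that a bar-code containing any base other than A, C, G, or T should be discarded. The function names are illustrative.

```python
from collections import defaultdict

BARCODE_LEN = 12  # length of the 12mer bar-code described in the text


def group_reads_by_barcode(reads, barcode_len=BARCODE_LEN):
    """Partition reads into groups ("clouds") keyed by their leading bar-code.

    Assumption (not from the source): the bar-code is the first
    `barcode_len` bases of the read, and the rest is the genomic insert.
    """
    clouds = defaultdict(list)
    for read in reads:
        read = read.upper()
        barcode, insert = read[:barcode_len], read[barcode_len:]
        # Skip reads too short to hold a full bar-code plus some insert.
        if len(barcode) < barcode_len or not insert:
            continue
        # Skip bar-codes with ambiguous or invalid base calls.
        if set(barcode) - set("ACGT"):
            continue
        clouds[barcode].append(insert)
    return dict(clouds)


def max_barcodes(barcode_len=BARCODE_LEN):
    """Upper bound on distinct bar-codes over the 4-letter DNA alphabet."""
    return 4 ** barcode_len


if __name__ == "__main__":
    toy_reads = [
        "ACGTACGTACGT" + "TTTTGGGG",
        "ACGTACGTACGT" + "CCCCAAAA",
        "GGGGGGGGGGGG" + "ATATATAT",
        "ACGTNCGTACGT" + "AAAA",  # ambiguous base in bar-code: dropped
        "ACGT",                   # shorter than a bar-code: dropped
    ]
    for barcode, inserts in group_reads_by_barcode(toy_reads).items():
        print(barcode, len(inserts))
    print(max_barcodes())
```

Each resulting group could then be assembled on its own as a problem of roughly 200 kbp rather than genome scale. A 12mer over four bases allows at most 4^12 = 16,777,216 distinct bar-codes.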

Public Health Relevance

Project Narrative: Gene sequencing is important to medicine. Our DNA sequencing method has the potential to reduce computational cost by orders of magnitude while making assembled genomes significantly more complete and accurate. The key to this method is the use of microfluidic handling technologies to subdivide genomic DNA into 200-kbp fragments, which are then amplified in isolation from each other and uniquely labeled to form a highly reducible data set for genome assembly.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
High Impact Research and Research Infrastructure Programs (RC2)
Project #
1RC2HG005596-01
Application #
7853052
Study Section
Special Emphasis Panel (ZHG1-HGR-N (O1))
Program Officer
Schloss, Jeffery
Project Start
2009-09-30
Project End
2011-07-31
Budget Start
2009-09-30
Budget End
2010-07-31
Support Year
1
Fiscal Year
2009
Total Cost
$732,668
Indirect Cost
Name
Stanford University
Department
Biomedical Engineering
Type
Schools of Medicine
DUNS #
009214214
City
Stanford
State
CA
Country
United States
Zip Code
94305
Albrecht, Jennifer Coyne; Kotani, Akira; Lin, Jennifer S et al. (2013) Simultaneous detection of 19 K-ras mutations by free-solution conjugate electrophoresis of ligase detection reaction products on glass microchips. Electrophoresis 34:590-7
Desmarais, Samantha M; Leitner, Thomas; Barron, Annelise E (2012) Quantitative experimental determination of primer-dimer formation risk by free-solution conjugate electrophoresis. Electrophoresis 33:483-91
Fredlake, Christopher P; Hert, Daniel G; Niedringhaus, Thomas P et al. (2012) Divergent dispersion behavior of ssDNA fragments during microchip electrophoresis in pDMA and LPA entangled polymer networks. Electrophoresis 33:1411-20
Wang, Xiaoxiao; Albrecht, Jennifer Coyne; Lin, Jennifer S et al. (2012) Monodisperse, "highly" positively charged protein polymer drag-tags generated in an intein-mediated purification system used in free-solution electrophoretic separations of DNA. Biomacromolecules 13:117-23
Kyriazopoulou-Panagiotopoulou, Sofia; Kashef Haghighi, Dorna; Aerni, Sarah J et al. (2011) Reconstruction of genealogical relationships with applications to Phase III of HapMap. Bioinformatics 27:i333-41
Lin, Jennifer S; Albrecht, Jennifer Coyne; Meagher, Robert J et al. (2011) Completely monodisperse, highly repetitive proteins for bioconjugate capillary electrophoresis: development and characterization. Biomacromolecules 12:2275-84
Ding, Sheng; Wang, Xiaoxiao; Barron, Annelise E (2011) Protein polymer: Gene libraries open up. Nat Mater 10:83-4
Albrecht, Jennifer Coyne; Kerby, Matthew B; Niedringhaus, Thomas P et al. (2011) Free-solution electrophoretic separations of DNA-drag-tag conjugates on glass microchips with no polymer network and no loss of resolution at increased electric field strength. Electrophoresis 32:1201-8
Albrecht, Jennifer Coyne; Lin, Jennifer S; Barron, Annelise E (2011) A 265-base DNA sequencing read by capillary electrophoresis with no separation matrix. Anal Chem 83:509-15
Niedringhaus, Thomas P; Milanova, Denitsa; Kerby, Matthew B et al. (2011) Landscape of next-generation sequencing technologies. Anal Chem 83:4327-41
