The goal of this SBIR application is to develop innovative technologies that provide more complete whole-genome sequences, accurately identify all genetic variants (single nucleotide variants, insertions/deletions, polyploidy, structural variants), and phase the variants to the appropriate homologous chromosome. This approach to long-standing challenges in genomics is highly significant and will transform the way genomes are sequenced and analyzed. The basic approach is to stretch many individual DNA molecules on an oligonucleotide chip. The oligonucleotides on the chip will prime these DNA molecules, which will serve as templates for traditional next-generation sequencing (NGS). Each oligonucleotide on the chip is barcoded to identify its location on the chip, and thus provides a scaffold for assembling the short NGS reads (e.g., from an Illumina HiSeq). Scaffolding the short reads will solve key problems in genomics by allowing high-quality de novo assembly with very accurate single nucleotide variant detection, structural variant detection, and resolution of haplotypes from diploid samples.
The aims of the proposal address the critical challenges in this process. First, in Aim 1 we will optimize the DNA chip fabrication to (a) reduce the feature size and pitch, (b) reverse the orientation of the oligonucleotides to leave the 3' end free for extension, and (c) increase the length and accuracy of the oligonucleotide synthesis. Second, in Aim 2 we will develop the approach for combing chromosomal DNA on oligonucleotide chip surfaces. We have experience combing DNA on traditional glass surfaces; however, the different surface properties of the DNA chip will likely pose new challenges for DNA combing. Finally, in Aim 3 we will generate sequencing libraries from the immobilized DNA and sequence the libraries on an Illumina HiSeq. We will also sequence the barcodes on the Illumina platform, which will provide a scaffold for assembling the short reads. In summary, this project will integrate several highly innovative technologies to address major limitations of current next-generation sequencing.
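
The scaffolding idea described above, in which a barcode identifies a chip location and the location orders the short reads, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the proposal: the barcode table `BARCODE_TO_POSITION`, the 4-base barcode at the start of each read, the function name `scaffold_reads`, and the simplification that one stretched molecule lies along one chip row.

```python
from collections import defaultdict

# Hypothetical map from feature barcode to (row, column) on the chip.
# A real chip would have far more features and longer barcodes.
BARCODE_TO_POSITION = {
    "ACGT": (0, 0),
    "TGCA": (0, 1),
    "GATC": (0, 2),
    "CTAG": (1, 0),
}


def scaffold_reads(reads, barcode_length=4):
    """Group reads by chip row and order them by chip column.

    Assumes (for illustration only) that each read begins with its
    feature barcode and that one stretched molecule spans one row.
    """
    rows = defaultdict(list)
    for read in reads:
        barcode = read[:barcode_length]
        insert = read[barcode_length:]
        position = BARCODE_TO_POSITION.get(barcode)
        if position is None:
            continue  # barcode not on the chip map: drop the read
        row, column = position
        rows[row].append((column, insert))
    # Sort each row's reads by column to recover their order along the molecule.
    return {
        row: [insert for _, insert in sorted(items)]
        for row, items in rows.items()
    }


reads = ["GATCTTTT", "ACGTAAAA", "CTAGGGGG", "TGCACCCC", "NNNNAAAA"]
scaffolds = scaffold_reads(reads)
```

In this toy input the reads arrive in arbitrary order, and the barcode lookup alone is enough to place them; an actual pipeline would also need to handle barcode sequencing errors and molecules that do not align with chip rows.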

Public Health Relevance

We will develop innovative technologies that provide a more complete human genome sequence and allow genome studies to accurately identify all variants and phase them to the appropriate homologous chromosome by assembling accurate reads of megabase length. Ultimately, our technologies will decrease the cost of whole-genome sequencing while dramatically increasing the accuracy and completeness of the results.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Small Business Innovation Research Grants (SBIR) - Phase I (R43)
Project #: 1R43HG008582-01
Application #: 8906555
Study Section: Special Emphasis Panel (ZRG1-IMST-J (15))
Program Officer: Smith, Michael
Project Start: 2015-04-07
Project End: 2017-03-31
Budget Start: 2015-04-07
Budget End: 2016-03-31
Support Year: 1
Fiscal Year: 2015
Total Cost: $350,000
Indirect Cost:
Name: Centrillion Biosciences, Inc.
Department:
Type:
DUNS #: 018448334
City: Palo Alto
State: CA
Country: United States
Zip Code: 94303