The Specific Aim of this Phase II SBIR proposal is to develop an affordable, commercial-quality product for improved metagenomic sequencing and culture-free microbial discovery. With the development of high-throughput sequencing technologies, we now have the capacity to sequence entire genomes of cultured microorganisms. However, we have only a limited capacity to sequence pooled genomes (microbiomes) through metagenomic sequencing, which includes sequencing the genomes of microorganisms that currently cannot be cultured. Using conventional protocols, the association of DNA fragments from the same species is lost during the DNA preparation process (cell lysis, DNA purification, and shearing). It is therefore nearly impossible to assign any specific DNA sequence to its organism of origin without relying on a priori knowledge. To overcome this hurdle, we have adapted the Hi-C proximity-ligation tool to join DNA molecules that are physically proximal to one another within an intact cell. Our successful Phase I studies showed the feasibility of developing a commercial product for the construction of high-quality Hi-C libraries from fecal, soil, and clinical samples. Deep sequencing of these DNA junctions enables us to reconstruct complete or near-complete genomes and deconvolute mixed strains without any culturing. We have successfully applied this technology to the deconvolution and assembly of artificially mixed populations of microorganisms that included various fungal, bacterial, and archaeal species, as well as a number of real-world metagenomic samples. The method has proven to be highly accurate and efficient, including in associating multiple chromosomes and plasmids with their host microorganism. Although the approach is feasible, turning it into a commercial product requires developing our laboratory methods into commercial-quality, high-throughput assay kits and finalizing software development.
To achieve our Specific Aim, we will carry out the following Tasks: Task 1: Optimize current Hi-C kit protocols; Task 2: Develop, test, and manufacture 96-well metagenomic Hi-C kits; Task 3: Develop algorithms and optimize software for analyzing Hi-C data; Task 4: Produce a customer-facing website for data analysis. Criteria for Success: A successful product would have to reduce the laboratory workflow to under 24 hours of prep time, be simple enough to allow multiplexing, and bring our cost of goods down to $50 per sample or less. The method would need to work on a wide array of metagenomic sample types, such as fecal, clinical, and environmental samples containing a variety of diverse microbes. Our kit will consist of enzymes and buffers to generate an Illumina-compatible Hi-C library from a raw sample, such as 0.25 grams of soil or 50 µL of fecal material, with a failure rate below 10% among users with moderate levels of experience. At least ten clients must independently validate our kits. The accompanying software will be able to assemble known genomes with at least 95% accuracy and generate at least 20 novel genomes per sample with >90% completeness and <5% contamination, as measured by the CheckM tool.
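
As an illustration of the genome-quality criterion above, the following is a minimal sketch of how per-sample results might be checked against the stated thresholds. It is not the proposal's software: the `GenomeBin` record and both function names are hypothetical, the completeness and contamination values are assumed to have already been extracted from a CheckM report, and the sketch does not test whether a genome is novel.

```python
from dataclasses import dataclass

# Thresholds taken from the success criteria stated in the abstract.
MIN_COMPLETENESS = 90.0      # percent; criterion is ">90%"
MAX_CONTAMINATION = 5.0      # percent; criterion is "<5%"
MIN_GENOMES_PER_SAMPLE = 20  # criterion is "at least 20" genomes per sample


@dataclass
class GenomeBin:
    """Hypothetical record of CheckM-style quality estimates for one genome bin."""
    name: str
    completeness: float   # percent
    contamination: float  # percent


def passing_bins(bins):
    """Return the bins that satisfy both quality thresholds (strict inequalities)."""
    return [
        b for b in bins
        if b.completeness > MIN_COMPLETENESS and b.contamination < MAX_CONTAMINATION
    ]


def sample_meets_criterion(bins):
    """True if a sample yields at least the required number of passing bins."""
    return len(passing_bins(bins)) >= MIN_GENOMES_PER_SAMPLE
```

A sample with 19 passing bins would fail this check, as would any bin sitting exactly at 90% completeness or 5% contamination, because the criteria are written as strict inequalities.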

Public Health Relevance

Our bodies and the environment are dominated by communities of microorganisms (microbiota) that have profound effects on our health, agriculture, environment, and diverse industrial processes. However, we have only a limited capacity to sequence pooled genomes (microbiomes), which includes sequencing the genomes of microorganisms that currently cannot be cultured. Metagenomic sequencing has already proven highly useful, especially in the study of human health and disease. It has provided insight into plant microbiomes, with implications for enhancing ecologically friendly agricultural management, and it has provided access to genomes of uncharted microorganisms that can be exploited for industrial processes. This proposal aims to overcome current limitations in sequencing, allowing any reasonably well-trained molecular biologist to sequence even the most complex microbiomes with state-of-the-art results.

Agency: National Institutes of Health (NIH)
Institute: National Institute of Allergy and Infectious Diseases (NIAID)
Type: Small Business Innovation Research Grants (SBIR) - Phase II (R44)
Project #: 5R44AI122654-03
Application #: 9719736
Study Section: Special Emphasis Panel (ZRG1)
Program Officer: Brown, Liliana L
Project Start: 2016-06-12
Project End: 2020-05-31
Budget Start: 2019-06-01
Budget End: 2020-05-31
Support Year: 3
Fiscal Year: 2019
Total Cost:
Indirect Cost:
Name: Phase Genomics, Inc.
Department:
Type:
DUNS #: 079752735
City: Seattle
State: WA
Country: United States
Zip Code: 98105