The goal of this project is to develop a rapid, cost-effective method based on whole genome sequencing for determining whether a hospital-acquired infection (HAI) transmission event has occurred. HAIs, particularly those caused by multidrug-resistant (MDR) organisms, are recognized as a widespread challenge, affecting one in 25 patients treated in healthcare facilities. Yet there is currently no rapid and robust method for confirming when a suspected transmission has occurred: gold-standard approaches for determining relatedness of bacterial infections, such as pulsed-field gel electrophoresis, are not employed on a regular basis due to their high cost and slow turnaround time. Whole genome sequencing (WGS) has recently been used successfully in HAI investigations to provide high-accuracy determination of clonality, yet the standard computational method, based on single nucleotide polymorphisms (SNPs), is slow and requires significant analysis by a skilled computational biologist. Thus, WGS is currently not widely available as a tool for HAI determination. Day Zero Diagnostics is developing a computational method to analyze the WGS data of bacterial infections with a novel algorithm that rapidly computes the genetic relatedness of samples to determine whether a transmission event has occurred. The method, called ksim, calculates the genomic similarity between samples by comparing their k-mers (subsequences of length k in the WGS data). The method has several advantages over the SNP method: it is fast, taking only a few minutes to compare two samples; it is automated, not requiring trained personnel; and it is scalable to large datasets of thousands of samples. This method can be deployed as a service for hospital transmission investigations, and has the capability to become the basis for proactive identification of potential transmission events in large-scale ongoing sequencing efforts.
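To make the idea of k-mer comparison concrete, the following is a minimal sketch and not the ksim algorithm itself, whose details are not given here. It assumes that similarity is measured as the Jaccard index of two k-mer sets. The function names, the toy sequences, and the choice of k = 4 are all hypothetical. Real WGS data would also call for handling of reverse complements and sequencing errors, which this sketch omits.

```python
from typing import Set


def kmers(sequence: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings (k-mers) of a sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_similarity(a: Set[str], b: Set[str]) -> float:
    """Shared k-mers divided by total distinct k-mers (assumed metric)."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


# Toy sequences standing in for assembled genomes (hypothetical data).
sample_a = "ATGCGTACGTTAGC"
sample_b = "ATGCGTACGTTAGC"  # identical to sample_a
sample_c = "ATGCGTTCGTTAGC"  # differs from sample_a by one substitution

K = 4  # illustrative only; real genomes would use a much larger k
sim_ab = jaccard_similarity(kmers(sample_a, K), kmers(sample_b, K))
sim_ac = jaccard_similarity(kmers(sample_a, K), kmers(sample_c, K))
print(f"a vs b: {sim_ab:.3f}")
print(f"a vs c: {sim_ac:.3f}")
```

In this toy case a single substitution changes every k-mer that overlaps it, so closely related samples score near 1 and diverged samples score lower. No alignment to a reference genome is needed, which is one reason set-based k-mer comparisons are fast.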
The objective of this Phase I SBIR project is to further develop ksim, which has shown promising initial results but needs further optimization to yield robust results across different bacterial species.
Aim 1 will optimize the algorithm for 5 species commonly involved in MDR HAIs, refining ksim to distinguish between core and accessory genomic regions, and optimizing parameters using published hospital outbreak datasets.
Aim 2 will demonstrate a proof-of-concept validation of the ksim method by testing it on 3 to 5 new suspected hospital cluster cases, and will demonstrate its applicability for proactive identification of outbreaks by determining the genetic relatedness of 5,000 clinical samples from a large single-hospital longitudinal WGS database. The resulting optimized algorithm will be useful for a large proportion of HAIs, and is readily expandable to additional pathogens through further development. Successful completion of this project has the potential to make WGS easily available for routine use in transmission investigations, which would dramatically improve the ability of infection control programs to manage and prevent HAIs.
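As an illustration of how proactive screening of a large sample collection might work, the sketch below groups samples whose pairwise similarity meets a threshold, using single-linkage clustering with a union-find structure. This is an assumption for illustration, not the method proposed above. The Jaccard metric, the threshold value, the sample names, and the toy k-mer sets are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared k-mers divided by total distinct k-mers (assumed metric)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_samples(kmer_sets: Dict[str, Set[str]],
                    threshold: float) -> List[Set[str]]:
    """Single-linkage clusters of samples with similarity >= threshold."""
    parent = {name: name for name in kmer_sets}

    def find(x: str) -> str:
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Compare every pair of samples; merge groups when similar enough.
    for a, b in combinations(kmer_sets, 2):
        if jaccard(kmer_sets[a], kmer_sets[b]) >= threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for name in kmer_sets:
        groups.setdefault(find(name), set()).add(name)
    # Only groups with two or more samples suggest possible transmission.
    return [g for g in groups.values() if len(g) > 1]


# Toy k-mer sets for four hypothetical patient isolates.
toy = {
    "p1": {"AAA", "AAC", "ACG"},
    "p2": {"AAA", "AAC", "ACG"},
    "p3": {"AAA", "AAC", "ACT"},
    "p4": {"GGG", "GGT"},
}
clusters = cluster_samples(toy, threshold=0.9)  # threshold is hypothetical
print(clusters)
```

Comparing all pairs among 5,000 samples requires roughly 12.5 million comparisons, so in practice the cost of each comparison matters a great deal. A suitable threshold would presumably have to be tuned per species, in line with the parameter optimization described in Aim 1.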
Hospital-acquired infections (HAIs) are recognized as a widespread concern, affecting 4% of patients in the United States, and multidrug-resistant pathogens can lead to particularly deadly hospital outbreaks. Yet current methods for determining when a hospital transmission has occurred are slow, costly, or difficult to automate. This project will develop a novel computational method that uses whole genome sequencing data to rapidly identify transmission events. If successful, our rapid and fully automated approach will provide cost-effective, high-resolution transmission information in as little as 24 hours from receipt of bacterial samples, a window of time that can have a major impact on the cost and magnitude of the interventions a hospital might employ.