Metagenomics, an emerging field made possible by the advent of next-generation sequencing technology, enables genome sequencing of unculturable and often unknown microbes in natural environments, offering researchers an unprecedented opportunity to delineate the biodiversity of microbial organisms. Mining single nucleotide polymorphisms (SNPs) from metagenomic sequencing data offers a unique opportunity to rapidly and accurately detect known or novel strains related to multiple biothreat agents. While sequencing technologies are evolving at unprecedented speed, researchers engaged in this enterprise face major computational, algorithmic, and statistical challenges in analyzing massive metagenomic data sets. Current analytical and computational methods are both inadequate for this challenge. In this project, the investigator and his colleagues develop a family of statistically sound and computationally efficient algorithms that detect SNPs from metagenomic data in order to characterize microbial diversity in natural environments.
The proposed project provides national security and biodefense agencies with new tools for rapid and accurate detection of biothreat agents. It also provides researchers in microbiology with new tools for producing abundant SNPs at high throughput for detailed analysis of the genetic basis of microbial diversity and evolution. Because this informatics tool can be used to study a wide variety of microbial communities, it helps accelerate scientific advances in microbiology and evolution. The multidisciplinary nature of the project will promote collaboration among biologists, computer scientists, and statisticians. It will also give postdoctoral fellows and graduate students hands-on training in statistics, genomics, and scientific computing.