The specific objective of this proposal is to create a method for quickly identifying the species composition of high-throughput DNA sequencing data. The main hypothesis is that every organism has a unique k-mer frequency vector that can be constructed from its genome and used, with methods from linear algebra and statistics, to quickly identify that organism in heterogeneous DNA sequencing samples. The goals of this research are to build a computational framework for storing and manipulating k-mer frequency vectors, to develop a regression model for identifying organisms and estimating their abundance in heterogeneous, short-read DNA sequencing samples, and to apply this method to city-wide pathogen detection as part of the New York City "PathoMap" project.
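
The idea of matching a mixed sample against per-organism k-mer frequency vectors can be illustrated with a minimal sketch. The proposal does not specify its regression model, so everything below is an assumption made for illustration: the function names `kmer_frequency_vector` and `estimate_abundances` are hypothetical, and the fit is a plain non-negative least-squares problem solved by projected gradient descent, not the method the project actually developed.

```python
from collections import Counter
from itertools import product


def kmer_frequency_vector(sequence, k=3):
    """Return the normalized k-mer frequency vector of a DNA sequence.

    The vector has 4**k entries, ordered lexicographically over A, C, G, T.
    k-mers containing other characters (for example N) are ignored.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def estimate_abundances(reference_vectors, sample_vector, steps=2000):
    """Fit sample ~ sum_j w_j * reference_j subject to w_j >= 0.

    Illustrative assumption: non-negative least squares via projected
    gradient descent. Returns the weights normalized to sum to 1.
    """
    n = len(reference_vectors)
    # The trace of the Gram matrix bounds its largest eigenvalue,
    # so 1 / trace is a safe (if conservative) step size.
    trace = sum(x * x for ref in reference_vectors for x in ref)
    if trace == 0:
        return [0.0] * n
    step = 1.0 / trace
    weights = [1.0 / n] * n
    for _ in range(steps):
        # residual = predicted mixture minus observed sample
        residual = [
            sum(w * ref[i] for w, ref in zip(weights, reference_vectors)) - s
            for i, s in enumerate(sample_vector)
        ]
        # Gradient step for every weight, then clip negatives to zero.
        weights = [
            max(0.0, w - step * sum(r * x for r, x in zip(residual, ref)))
            for w, ref in zip(weights, reference_vectors)
        ]
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else weights


# Toy usage with two made-up "genomes" and a synthetic 70/30 mixture.
genome_a = kmer_frequency_vector("ACGT" * 50, k=2)
genome_b = kmer_frequency_vector("AAGGCCTT" * 25, k=2)
sample = [0.7 * x + 0.3 * y for x, y in zip(genome_a, genome_b)]
print(estimate_abundances([genome_a, genome_b], sample))
```

In this sketch the estimated weights are shares of k-mer mass in the sample, not cell counts; converting them to organism abundance would require further assumptions, such as correcting for genome length. Real short-read data would also add sequencing noise and many more reference genomes than this toy example.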

Public Health Relevance

This project is relevant to public health because its results will be useful in research that requires a clear understanding of species composition in complex microbial communities, such as communities that pose potential pathogenic threats to the people who interact with them. In addition, much current research investigates the role that microbial organisms play in regulating health and disease inside the human body. Such microbiome studies will benefit from accurate and efficient computational methods such as the one presented in this proposal.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Predoctoral Individual National Research Service Award (F31)
Project #
1F31GM111053-01
Application #
8721157
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Gaillard, Shawn R
Project Start
2014-07-01
Project End
2016-06-30
Budget Start
2014-07-01
Budget End
2015-06-30
Support Year
1
Fiscal Year
2014
Total Cost
Indirect Cost
Name
Weill Medical College of Cornell University
Department
Type
Graduate Schools
DUNS #
City
New York
State
NY
Country
United States
Zip Code
10065
MetaSUB International Consortium (2016) The Metagenomics and Metadesign of the Subways and Urban Biomes (MetaSUB) International Consortium inaugural meeting report. Microbiome 4:24
Rosenfeld, Jeffrey A; Reeves, Darryl; Brugler, Mercer R et al. (2016) Genome assembly and geospatial phylogenomics of the bed bug Cimex lectularius. Nat Commun 7:10164
Kolokotronis, Sergios-Orestis; Foox, Jonathan; Rosenfeld, Jeffrey A et al. (2016) The mitogenome of the bed bug Cimex lectularius (Hemiptera: Cimicidae). Mitochondrial DNA B Resour 1:425-427
Afshinnekoo, Ebrahim; Meydan, Cem; Chowdhury, Shanin et al. (2015) Geospatial Resolution of Human and Bacterial Diversity with City-Scale Metagenomics. Cell Syst 1:72-87