This Small Business Innovation Research (SBIR) Phase I project proposes to develop a system for automated classification of biological samples and discovery of biomarkers. The goal is a system to perform comprehensive pattern analysis of state-of-the-art biochemical separations generated by comprehensive two-dimensional gas chromatography (GCxGC) with high-resolution mass spectrometry (HRMS). A critical challenge for effective utilization of GCxGC-HRMS for biochemical classification and biomarker discovery is the difficulty of analyzing and interpreting the massive, complex data for metabolomic and proteomic features. The quantity and complexity of the data, as well as the large dimensionality of the biochemistry, in which significant characteristics may be subtle and involve patterns of variations in multiple constituents, necessitate the investigation and development of new bioinformatics. The principal technical objective is an innovative framework for comprehensive feature matching and analysis across many samples. Feature matching is the basis for uniformly labeling structures so that similarities and differences can be documented. Specifically, the framework will incorporate advanced methods for multidimensional peak detection, peak pattern matching across large sample sets, data alignment, GCxGC-HRMS feature computations, and classification with large feature sets. The anticipated result is the technical foundation for a commercial system to classify biological samples and identify significant biomarkers.

The broader impact/commercial potential of this project, if successful, will be a better understanding of biochemical processes and discovery of metabolomic and proteomic biomarkers, leading to improved methods for disease diagnoses and treatments. These innovative bioinformatics will contribute to economic competitiveness in the global market for analytical technologies and will foster utilization of advanced GCxGC-HRMS instrumentation. The informatics developed in this project also will be relevant for other classification problems involving multidimensional, multispectral data, including other applications (such as biofuels), other types of chemical analyses (such as multidimensional spectroscopy), and other fields (such as remote-sensing multispectral geospatial imagers). The project will contribute to workforce development, by involving students in research experiences through internships and project sponsorships, and to education, by providing software and example data to allow students to more easily explore biochemical complexity.

Project Report

This NSF SBIR Phase I project investigated and developed an informatics framework for automated classification of biological samples and discovery of biomarkers with new state-of-the-art biochemical separation instruments. Comprehensive two-dimensional gas chromatography (GCxGC) with high-resolution mass spectrometry (HRMS) provides more effective molecular separations than traditional GC-MS and GCxGC-MS, with precise elemental analysis, but generates data in significantly greater quantity and complexity. The new bioinformatics developed by the project directly address the critical need for software that can discover significant metabolomic and proteomic biomarkers from patterns of subtle chemical variations in complex biological samples. Based on research supported with this NSF SBIR award, the company, GC Image, LLC, developed and tested a proof-of-concept prototype of a framework for comprehensive feature generation, matching, and analysis with large sample sets of GCxGC-HRMS data. This initial framework and prototype methods are incorporated in a commercial product, GC Image GCxGC-HRMS Software. The software, packaged and sold with the Zoex FasTOF, the first commercial GCxGC-HRMS instrument, provides the tools to meet rudimentary needs of researchers. This software will be the foundation for a new product that transforms complex, multi-sample GCxGC-HRMS data into metabolomic information with significant value in biomedical research for market innovations such as improved diagnostic tests, new drugs, and more effective treatment regimes. The power of GCxGC-HRMS, supported by effective information technologies, will enable researchers to improve their understanding of biomolecular processes in health, disease, and treatment and realize significant value in biomedical research and product development.

Budget Start: 2010-07-01
Budget End: 2011-06-30
Fiscal Year: 2010
Total Cost: $174,643
Name: GC Image, LLC
City: Lincoln
State: NE
Country: United States
Zip Code: 68508