Owing to recent technological advances in measurement platforms, it is now possible to simultaneously detect and characterize a very large number of metabolites covering a substantial fraction of the small molecules present in a biological sample. This presents an exciting opportunity to develop potentially transformative approaches to study cells and organisms. One major challenge in realizing this potential lies in processing and analyzing the data. A typical dataset from an untargeted experiment contains many thousands of "features," each of which could correspond to a unique metabolite. Analyzing such datasets to obtain meaningful biological information depends on reliably and efficiently resolving the chemical identities of the detected features. Currently, in silico fragmentation methods predict candidate metabolites that are scored and ranked based on how well the fragmentation explains the observed MS/MS spectrum, and on other factors influencing fragmentation such as bond dissociation energies and ionization conditions. Deciding which candidate metabolite is the best match for a particular feature in the context of the biological sample, however, is a daunting task. Extensive testing of candidate metabolites against a library of chemical standards may be prohibitive in terms of cost and effort. We seek to develop software-enabled workflows centered on resolving metabolite identities. Our approach is to exploit knowledge of the biological context of a sample to identify the metabolites. Recognizing that the metabolites present in a sample result from enzyme-catalyzed biochemical reactions active in the corresponding biological system, we employ topological analysis and inference to best map the metabolites implied by the detected features to metabolic pathways that are feasible based on the genome(s) of cells in the biological system.
Aim 1 will develop a computational method based on Bayesian inference to enhance the candidate metabolite rankings obtained via in silico fragmentation analysis. Our method utilizes all available information (database lookups, in silico fragmentation analysis, and network/pathway context) to maximally inform and adjust the rankings.
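The re-ranking idea in Aim 1 can be illustrated with a minimal sketch. This is not the project's method. It assumes, purely for illustration, that an in silico fragmentation score can be treated as a likelihood and that pathway context enters as a simple two-valued prior. The names `Candidate`, `rerank`, `prior_in`, and `prior_out`, and the numeric values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    frag_score: float  # fragmentation score in (0, 1], treated here as a likelihood (assumption)
    in_network: bool   # True if the candidate maps to a genome-supported pathway (assumption)


def rerank(candidates, prior_in=0.8, prior_out=0.2):
    """Return (name, posterior) pairs sorted by descending posterior.

    posterior is proportional to frag_score * prior, normalized over the
    candidates proposed for a single feature.
    """
    if not candidates:
        return []
    # Unnormalized posterior: likelihood times context-dependent prior.
    unnorm = [
        c.frag_score * (prior_in if c.in_network else prior_out)
        for c in candidates
    ]
    total = sum(unnorm)
    if total <= 0:
        raise ValueError("all candidates have zero posterior mass")
    posteriors = [(c.name, u / total) for c, u in zip(candidates, unnorm)]
    return sorted(posteriors, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidates for one detected feature.
example = [
    Candidate("candidate_A", frag_score=0.6, in_network=False),
    Candidate("candidate_B", frag_score=0.4, in_network=True),
]
ranked = rerank(example)
```

In this toy case candidate_A has the higher fragmentation score, but candidate_B moves to the top once the pathway prior is applied, which is the kind of adjustment the aim describes.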
Aim 2 will build software widgets to implement the metabolite identification workflow within a data-analytics framework. As the analytics framework, we will use Orange, which allows the user to create interactive data analysis pipelines through a plug-and-play graphical user interface (GUI).
Aim 3 will validate the computational method and software widget implementation. Experimental validation will utilize high-purity standards to confirm (or reject) the computationally assigned metabolite identities. Widget implementation will be evaluated through a focus group discussion with the widget users in the labs directed by the PIs. As project outcomes, we anticipate both a methodological advance in analyzing mass signature data and a suite of easily accessible software in the form of widgets.

Public Health Relevance

Metabolomics is concerned with the comprehensive characterization of the small-molecule metabolites in biological systems. Owing to recent technological advances in measurement platforms, it is now possible to simultaneously detect and characterize a very large number of metabolites. Prospectively, advanced computational tools and software for metabolomics data analysis can aid discovery efforts aimed at identifying novel bioactive metabolites that could be developed into diagnostic indicators or therapeutic agents.

Agency
National Institutes of Health (NIH)
Institute
National Cancer Institute (NCI)
Type
Small Research Grants (R03)
Project #
1R03CA211839-01
Application #
9223450
Study Section
Special Emphasis Panel (ZRG1-BST-U (50)R)
Program Officer
Spalholz, Barbara A
Project Start
2016-09-15
Project End
2017-08-31
Budget Start
2016-09-15
Budget End
2017-08-31
Support Year
1
Fiscal Year
2016
Total Cost
$147,569
Indirect Cost
$47,569
Name
Tufts University
Department
Engineering (All Types)
Type
Schools of Engineering
DUNS #
073134835
City
Medford
State
MA
Country
United States
Zip Code
02155
Stieglitz, Jessica T; Kehoe, Haixing P; Lei, Ming et al. (2018) A Robust and Quantitative Reporter System To Evaluate Noncanonical Amino Acid Incorporation in Yeast. ACS Synth Biol 7:2256-2269