Computational Techniques for Advancing Untargeted Metabolomics Analysis

Hassoun, Soha

Abstract

Detecting and quantifying products of cellular metabolism using mass spectrometry (MS) has already shown great promise in biomarker discovery, nutritional analysis, and other biomedical research fields. Despite recent advances in analysis techniques, our ability to interpret MS measurements remains limited. The biggest challenge in metabolomics is annotation, where measured compounds are assigned chemical identities. The annotation rates of current computational tools are low: in several surveyed metabolomics studies, fewer than 20% of all compounds are annotated. A contributing factor to low annotation rates is the lack of systematic ways of designing a candidate set, a listing of putative chemical identities that can be used during annotation. Relying on existing databases is problematic because, given the large combinatorial space of molecular arrangements, many biologically relevant compounds are not catalogued in databases or documented in the literature. A secondary yet important challenge is interpreting the measurements to understand the metabolic activity of the sample under study. Current techniques are limited in utilizing complex information about the sample to elucidate metabolic activity. The goal of this project is to develop computational techniques to advance the interpretation of large-scale metabolomics measurements. To address current challenges, we propose to pursue three Aims: (1) Engineering candidate sets that enhance biological discovery. (2) Developing new techniques for annotation, including deep learning and incremental build-out methods, to recommend novel chemical structures that best explain the measurements. (3) Constructing probabilistic models to analyze metabolic activity. Each technique will be rigorously validated computationally and experimentally using chemical standards. Two detailed case studies on the intestinal microbiota will allow us to further validate our tools.
Microbiota-derived metabolites have been detected in circulation and shown to engage host cellular pathways in organs and tissues beyond the digestive system. Identifying these metabolites is thus critical for understanding the metabolic function of the microbiota and elucidating their mechanisms. The complex test cases will challenge our techniques, provide feedback during development, and allow us to further disseminate our techniques. We will work closely with early adopters of our tools, as proposed in supporting letters, to further validate our tools and encourage wide adoption. All proposed tools will be open source and made accessible through the web. Our tools promise to change current practices in interpreting metabolomics data beyond what is currently possible with databases, current annotation tools, statistical and overrepresentation analysis, or combinations thereof. The use of machine learning and large data sets as proposed herein defines the most promising research direction in metabolomics analysis.

Public Health Relevance

Untargeted metabolomics is a recently developed technique that allows the measurement of thousands of molecules in a biological sample. This work proposes several novel computational techniques that address limitations of current metabolomics analysis tools. We anticipate that this work will advance discoveries in biomedical research and have direct benefits to human health.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 3R01GM132391-01A1S1
Application #: 10145183
Program Officer: Ravichandran, Veerasamy

Project Start: 2019-09-23
Project End: 2023-08-31
Budget Start: 2019-09-23
Budget End: 2020-08-31
Support Year: 1
Fiscal Year: 2020

Institution

Name: Tufts University
Department: Biostatistics & Other Math Sci
Type: Biomed Engr/Col Engr/Engr Sta
DUNS #: 073134835

City: Boston
State: MA
Country: United States
Zip Code: 02111

Related projects


NIH 2020 R01 GM	Computational Techniques for Advancing Untargeted Metabolomics Analysis Hassoun, Soha / Tufts University
NIH 2019 R01 GM	Computational Techniques for Advancing Untargeted Metabolomics Analysis Hassoun, Soha / Tufts University
