Nuclear magnetic resonance (NMR) spectroscopy-based metabolomics is a powerful method for identifying metabolic perturbations that report on different biological states and sample types. Compared to mass spectrometry, NMR provides robust and highly reproducible quantitative data in a matter of minutes, which makes it well suited to first-line clinical diagnostics. Although the metabolome is known to provide an instantaneous snapshot of the biological status of a cell, tissue, or organism, the use of NMR in clinical practice is hindered by cumbersome data analysis. Major challenges include the high dimensionality of the data, overlapping signals, variability of resonance frequencies (chemical shift), non-ideal signal shapes, and low signal-to-noise ratio (SNR) for low-concentration metabolites. Existing approaches fail to address these challenges: sample analysis is time-consuming, performed manually, and requires considerable knowledge of NMR spectroscopy. Recent developments in sparse methods for machine learning and accelerated convex optimization for high-dimensional problems, as well as kernel-based spatial clustering, show promise for overcoming these challenges and achieving fully automated, operator-independent analysis. We are developing two novel, powerful, and automated algorithms that capitalize on these recent developments in machine learning.
In Aim 1, we describe "NMRQuant" for automated identification and quantification of annotated metabolites irrespective of chemical shift variability, low SNR, and signal shape variability.
In Aim 2, we describe "SPA-STOCSY" for automated de novo identification of molecular fragments of unknown, non-annotated metabolites. Based on substantial preliminary data, we propose to evaluate these algorithms' sensitivity, specificity, stability, and resistance to noise on phantom, biological, and clinical samples, comparing them to current methods. We will validate the accuracy of the analyses by experimental 2D NMR, spike-in, and mass spectrometry. The proposed efforts will produce new NMR analytical software for discovery of both annotated and non-annotated metabolites, substantially improving the accuracy and reproducibility of NMR analysis. Such analytical ability would change the existing paradigm of NMR-based metabolomics and provide an even stronger complement to current mass spectrometry-based methods. This approach, once thoroughly validated, will enable NMR to reach a wide range of applications in biomedical, pharmaceutical, and nutritional research and in clinical medicine.

Public Health Relevance

This project seeks to develop an advanced and automated platform for identifying NMR metabolomics biomarkers of disease and for fundamental studies of biological systems. When fully developed, these approaches could be used to detect small molecules in blood or urine that indicate the onset of various diseases, drug toxicity, or environmental effects on the organism.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM120033-03
Application #: 9608754
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Ravichandran, Veerasamy
Project Start: 2017-01-01
Project End: 2021-12-31
Budget Start: 2019-01-01
Budget End: 2019-12-31
Support Year: 3
Fiscal Year: 2019
Total Cost:
Indirect Cost:
Name: Baylor College of Medicine
Department: Pediatrics
Type: Schools of Medicine
DUNS #: 051113330
City: Houston
State: TX
Country: United States
Zip Code: 77030
Guo, Caiwei; Jeong, Hyun-Hwan; Hsieh, Yi-Chen et al. (2018) Tau Activates Transposable Elements in Alzheimer's Disease. Cell Rep 23:2874-2880
Jeong, Hyun-Hwan; Yalamanchili, Hari Krishna; Guo, Caiwei et al. (2018) An ultra-fast and scalable quantification pipeline for transposable elements from next generation sequencing data. Pac Symp Biocomput 23:168-179
Raman, Ayush T; Pohodich, Amy E; Wan, Ying-Wooi et al. (2018) Apparent bias toward long gene misregulation in MeCP2 syndromes disappears after controlling for baseline variations. Nat Commun 9:3225
Kadur Lakshminarasimha Murthy, Preetish; Srinivasan, Tara; Bochter, Matthew S et al. (2018) Radical and lunatic fringes modulate notch ligands to support mammalian intestinal homeostasis. Elife 7:
Yi, Haidong; Raman, Ayush T; Zhang, Han et al. (2018) Detecting hidden batch factors through data-adaptive adjustment for biological effects. Bioinformatics 34:1141-1147
Ma, Zaijun; Wang, Hui; Cai, Yuping et al. (2018) Epigenetic drift of H3K27me3 in aging links glycolysis to healthy longevity in Drosophila. Elife 7:
Li, Biao; Sierra, Amanda; Deudero, Juan Jose et al. (2017) Multitype Bellman-Harris branching model provides biological predictors of early stages of adult hippocampal neurogenesis. BMC Syst Biol 11:90
Wang, Julia; Al-Ouran, Rami; Hu, Yanhui et al. (2017) MARRVEL: Integration of Human and Model Organism Genetic Resources to Facilitate Functional Annotation of the Human Genome. Am J Hum Genet 100:843-853
Jin, Haijing; Wan, Ying-Wooi; Liu, Zhandong (2017) Comprehensive evaluation of RNA-seq quantification methods for linearity. BMC Bioinformatics 18:117
Yalamanchili, Hari Krishna; Wan, Ying-Wooi; Liu, Zhandong (2017) Data Analysis Pipeline for RNA-seq Experiments: From Differential Expression to Cryptic Splicing. Curr Protoc Bioinformatics 59:11.15.1-11.15.21