Nuclear magnetic resonance (NMR) spectroscopy-based metabolomics is a powerful method for identifying metabolic perturbations that report on different biological states and sample types. Compared to mass spectrometry, NMR provides robust and highly reproducible quantitative data in a matter of minutes, which makes it well suited to first-line clinical diagnostics. Although the metabolome is known to provide an instantaneous snapshot of the biological status of a cell, tissue, or organism, the use of NMR in clinical practice is hindered by cumbersome data analysis. Major challenges include the high dimensionality of the data, overlapping signals, variability of resonance frequencies (chemical shifts), non-ideal signal shapes, and low signal-to-noise ratio (SNR) for low-concentration metabolites. Existing approaches fail to address these challenges, and sample analysis remains time-consuming, manual, and dependent on considerable knowledge of NMR spectroscopy. Recent developments in sparse methods for machine learning, accelerated convex optimization for high-dimensional problems, and kernel-based spatial clustering show promise for overcoming these challenges and achieving fully automated, operator-independent analysis. We are developing two novel, automated algorithms that capitalize on these recent developments in machine learning.
In Aim 1, we describe "NMRQuant" for automated identification and quantification of annotated metabolites irrespective of chemical shift variability, low SNR, and signal shape variability.
In Aim 2, we describe "SPA-STOCSY" for automated de novo identification of molecular fragments of unknown, non-annotated metabolites. Based on substantial preliminary data, we propose to evaluate these algorithms' sensitivity, specificity, stability, and resistance to noise on phantom, biological, and clinical samples, comparing them to current methods. We will validate the accuracy of the analyses by experimental 2D NMR, spike-in experiments, and mass spectrometry. The proposed efforts will produce new NMR analytical software for the discovery of both annotated and non-annotated metabolites, substantially improving the accuracy and reproducibility of NMR analysis. Such analytical ability would change the existing paradigm of NMR-based metabolomics and provide an even stronger complement to current mass spectrometry-based methods. This approach, once thoroughly validated, will enable NMR to reach a wide range of applications in biomedical, pharmaceutical, and nutritional research and in clinical medicine.
This project seeks to develop an advanced and automated platform for identifying NMR metabolomics biomarkers of diseases and for fundamental studies of biological systems. When fully developed, these approaches could be used to detect small molecules in the blood or urine, indicative of the onset of various diseases, drug toxicity, or environmental effects on the organism.