The Computational Core will develop tools that support the Experimental Core and will create an informatic compound identification resource that greatly expands the available biological and analytical characterization of xenobiotic compounds. This includes computational tools that maximize information capture from the Experimental Core and use this information to build informatic compound identification resources for high-resolution mass spectrometry (HRMS) platforms. The Core is structured to deliver key analytics for mega-scale mass spectral data processing and improved workflows for chemical identification using high-throughput arrays. The team has extensive expertise in systems biology, computational metabolomics, multiomic integration, database management, and HRMS spectral processing, and will leverage informatic and machine learning expertise in the NIEHS-funded HERCULES Environmental Health Data Sciences Core (EHDSC) at Emory University. The Computational Core will provide sustained impact for the Metabolomics Consortium through development of an open-source, platform-independent software pipeline and cloud-based xenobiotic databases. Throughout pipeline creation and implementation, we will work closely with the Metabolomics Consortium Stakeholder Engagement and Program Coordination Center (SEPCC) to provide consistent identification metrics and annotation best practices, and will elicit feedback from the National Metabolomics Data Repository to maximize synergy with metabolomic datasets. Because this project has unique needs, namely improved algorithms for in silico prediction of biotransformation products and ion dissociation patterns and processing tools for the large amount of metabolite data generated by the Experimental Core, we have identified key milestones and deliverables to meet ECIDC objectives.
These will be accomplished through aims designed to process MS/MS spectra for thousands to hundreds of thousands of metabolites generated by the Experimental Core in a time- and cost-effective manner. To do so, we will develop a semi-automated workflow that combines visual scripting, computational prediction of enzymatic biotransformation products, and MS/MS spectral deconvolution that uses correlation across samples to isolate high-purity dissociation patterns. We will build upon the mega-biotransformation-identification pipeline to 1) calibrate and enhance in silico prediction of biotransformation products from parent compounds, 2) calibrate and enhance in silico prediction of MS/MS dissociation patterns, 3) develop LC retention time and adduct prediction tools for reducing false matches, 4) build a combined cloud-based database containing experimental and predicted MS/MS spectral patterns for xenobiotics and metabolites, and 5) construct exposome-based metabolic pathway maps to rapidly assess xenobiotic exposure enrichment in human populations using untargeted HRMS profiling data. These tools will be scalable across instruments and sample numbers, supporting the goal of mega-scale identification of xenobiotic metabolites.
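The idea of using correlation across samples to isolate dissociation patterns can be illustrated with a minimal sketch. It is not the Core's pipeline: the function names, the Pearson statistic, the 0.9 threshold, and the intensity values are all assumptions chosen for illustration. The sketch keeps only those fragment ions whose intensity across samples tracks the precursor ion, on the premise that fragments from a co-eluting contaminant will vary independently of the precursor.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal carries no correlation information
    return cov / sqrt(var_x * var_y)


def deconvolve(precursor, fragments, min_r=0.9):
    """Return fragment m/z values whose cross-sample intensity profile
    correlates with the precursor at or above min_r (assumed threshold)."""
    return sorted(
        mz for mz, intensities in fragments.items()
        if pearson(precursor, intensities) >= min_r
    )


# Hypothetical intensities for one precursor measured in four samples.
precursor = [1.0, 2.0, 4.0, 8.0]
fragments = {
    105.07: [0.5, 1.0, 2.0, 4.0],   # proportional to the precursor
    77.04: [0.2, 0.45, 0.8, 1.7],   # tracks the precursor with noise
    91.05: [3.0, 1.0, 2.5, 0.5],    # varies independently (contaminant)
}

pure_fragments = deconvolve(precursor, fragments)
print(pure_fragments)
```

With these made-up values, the fragments at m/z 77.04 and 105.07 are retained and the one at 91.05 is rejected. A production workflow would also need to handle missing values, retention time alignment, and multiple-testing effects across many features.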

Emory University
United States