Mapping the Secondary Metabolomes of Marine Cyanobacteria

Bacteria are extraordinarily prolific sources of structurally unique and biologically active natural products that derive from a diversity of fascinating biochemical pathways. However, complete structure elucidation is often the most time-consuming and costly step in natural product drug discovery programs. Compounding this, advances in genome sequencing have accelerated the identification of unique modular biosynthetic gene clusters in prokaryotes and revealed a wealth of new compounds that have yet to be isolated and characterized biologically and chemically. As a result, there is an urgent and continuing need in this field to connect biosynthetic gene clusters to their respective MS fragmentation signatures in MS2 molecular networks. The capacity to make such connections will accelerate new compound discovery, create associations between gene cluster and biosynthetic pathway, and aid fast and accurate structure elucidation. In combination with this informatics approach, the proposed continuation project explores innovative methods for solving complex molecular structures through enhanced MS and NMR experiments, and develops new algorithms to accelerate their analysis. Thus, the overarching goal of this grant is to develop efficient methods that facilitate automated structural classification, structural feature discovery and, ultimately, efficient structure elucidation of natural products (or any small molecule), and to build an infrastructure that interacts with data input from the community. We will achieve this with the following four specific aims:
Aim 1. To integrate MS2 molecular networking with gene cluster networking to rapidly and efficiently locate natural products that have unique molecular architectures;
Aim 2. To develop a suite of high-sensitivity pulse sequences for natural product structure elucidation;
Aim 3. To develop NMR-based molecular networking strategies using Deep Convolutional Neural Networks (DCNNs) to facilitate the categorization and structure elucidation of organic compounds;
Aim 4. To integrate NMR molecular networking and MS2-based molecular networking as an efficient structure characterization and elucidation strategy.

By achieving these aims we will develop an innovative workflow for finding new compounds and for determining their structures, both quickly and accurately. The connection between gene cluster and molecule will shed light on stereochemistry and on potential halogenations and methylations. This information can then be used in combination with more efficient NMR and MS methods to determine structures accurately. These tools will be shared widely, for example through the Global Natural Products Social Molecular Networking (GNPS) platform, to enhance the overall capacity of the natural products and organic chemistry communities to solve complex molecular structures.
Natural products are compounds produced by natural sources, and about 50% of FDA-approved drugs can trace their origin back to natural products. This proposal aims to use our data set of natural products produced by cyanobacteria to develop analytical tools that will speed up and streamline the discovery and structure elucidation of new compounds.