Software for automated interpretation of heparan sulfate tandem mass spectra

The expression of heparan sulfate (HS) is required for embryonic development and the functioning of every physiological system. Present in intracellular granules, on cell surfaces, and in extracellular matrices, HS binds to growth factor families and their receptors. These binding interactions modulate cellular responses to growth factor stimuli in a manner specific to cell type and developmental state. The challenge in exploiting HS structures as drugs lies in the nature of their biosynthesis and in their chemical properties. HS chains are assembled in the endoplasmic reticulum and Golgi apparatus by a series of enzymes acting in a non-template-driven manner. Chains are first polymerized and then subjected to modification events that produce mature chains with a regulated domain structure overlaid by substantial heterogeneity. As HS chain biosynthesis proceeds, the number of biosynthetic enzyme isoforms increases. These enzymes, including 6O-sulfotransferases and 3O-sulfotransferases, are expressed in a tissue- and cell-type-specific manner and are believed to modify substrates with isoform specificity. The result is HS chains that have phenotype-specific structures and protein-binding functions. Despite the availability of mouse mutants for many of the biosynthetic enzymes, progress in HS biomedicine has suffered from the lack of widely adopted sequencing methods. Over the past few years, however, electron activated dissociation (ExD) methods have been developed in mass spectrometry laboratories. These methods, including electron detachment dissociation (EDD) and negative electron transfer dissociation (NETD), demonstrate the feasibility of instrumental sequencing of HS saccharides. Their advantages are that they require little or no derivatization, are compatible with high throughput, and provide rich structural information on the HS saccharides.
The tandem mass spectra are highly complex, however, and tailor-made bioinformatics methods are necessary to convert raw data into sequences. We have demonstrated the feasibility of an algorithm (HS-SEQ) for sequencing HS saccharides from ExD tandem mass spectra. We now propose to develop HS-SEQ further so that it performs better, offers the full set of features needed for automated HS analysis, and can be used easily by the wider biomedical community. We expect the availability of mass spectrometers with NETD or EDD capability to grow rapidly over the next few years. We will develop a data processing pipeline that includes all steps necessary to go from raw data to sequence information. This pipeline will be designed for use by biomedical scientists familiar with HS biochemistry and/or proteomics methods. The HS-SEQ pipeline will run as a web service and be available in source code and binary installer form under a Creative Commons license.
Although heparan sulfate (HS) is necessary for all aspects of human physiology, understanding of the relevant biological mechanisms suffers from the lack of widely disseminated sequencing methods. Recently developed HS sequencing methods remain the purview of experts because software for analyzing the data is lacking. We propose to develop a software pipeline, tailored to the needs of non-experts, for analysis of raw tandem mass spectral data on HS saccharides.