This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator. The Boston University Bioinformatics Program requires first-year graduate students to participate in year-long "Challenge Projects" in which they work with academic collaborators on significant bioinformatics problems. During the 2009-10 academic year, Prof. Zaia participated in such a project, which aimed to develop software for targeted analysis of glycomics liquid chromatography mass spectrometry data sets. The primary challenges to the analysis of such data sets are (1) that glycan liquid chromatography peaks are not as sharp as those commonly obtained for peptides in proteomics; (2) that glycan elemental compositions and stable isotope envelopes differ significantly from those of peptides; and (3) that effective analysis of glycan data sets is best accomplished using full-resolution, or profile-mode, data, which results in large file sizes. Software tools designed for proteomics often fail because of factors (1) and/or (2) above. Metabolomics tools are often designed for analysis of compounds smaller than 1000 Da, for which low-resolution bar graph mass spectra are adequate. Thus, such algorithms often fail because of factor (3) above. The theoretical glycan compositions for a given class may be easily calculated. Typically, a few hundred compositions define the biological space for a given glycan class. Given this, glycomics liquid chromatography mass spectrometry data may be queried effectively by extracting m/z values corresponding to each theoretical glycan composition.
This method mirrors the manual extraction of glycan composition and abundance information used in publications from the group. Over the year, the informatics team developed an algorithm that reliably extracts this information from glycomics data sets and runs rapidly on a laptop computer.
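
The targeted query described above can be illustrated with a minimal Python sketch. It is not the algorithm developed in the project, whose details are not given here. It assumes a small set of monosaccharide building blocks, protonated positive-mode ions, a fixed ppm tolerance, and scans already loaded as (retention time, m/z list, intensity list) tuples. The function names, the residue set, and the tolerance are all hypothetical choices made for illustration.

```python
from itertools import product

# Monoisotopic residue masses in Da (monosaccharide minus water).
# The choice of residues is an assumption; other glycan classes need other sets.
RESIDUE_MASS = {
    "Hex": 162.05282,
    "HexNAc": 203.07937,
    "dHex": 146.05791,
    "NeuAc": 291.09542,
}
WATER = 18.010565
PROTON = 1.007276


def neutral_mass(composition):
    """Monoisotopic mass of a free glycan: sum of residues plus one water."""
    return WATER + sum(RESIDUE_MASS[name] * n for name, n in composition.items())


def theoretical_mz(composition, charge):
    """m/z of the [M + zH]z+ ion (protonated adducts are an assumption)."""
    return (neutral_mass(composition) + charge * PROTON) / charge


def enumerate_compositions(max_counts):
    """Yield every non-empty composition up to the given per-residue maxima."""
    names = list(max_counts)
    ranges = [range(max_counts[name] + 1) for name in names]
    for counts in product(*ranges):
        if any(counts):
            yield dict(zip(names, counts))


def extract_chromatogram(scans, target_mz, ppm=10.0):
    """Sum profile intensity within +/- ppm of target_mz in each scan.

    scans is an iterable of (retention_time, mz_values, intensities) tuples.
    Returns a list of (retention_time, summed_intensity) pairs.
    """
    tolerance = target_mz * ppm * 1e-6
    trace = []
    for retention_time, mz_values, intensities in scans:
        total = sum(
            intensity
            for mz, intensity in zip(mz_values, intensities)
            if abs(mz - target_mz) <= tolerance
        )
        trace.append((retention_time, total))
    return trace
```

In use, one would loop over `enumerate_compositions(...)` and the expected charge states, call `theoretical_mz` for each, and pass the result to `extract_chromatogram`. Summing over a window rather than picking a single centroid is one simple way to work with profile-mode data. A fuller treatment would also need to handle the glycan isotope envelope and the broad chromatographic peaks noted above.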