Indiana University is awarded a CAREER grant for Dr. Haixu Tang to pursue algorithm and software development for mass spectrometry (MS)-based glycomics, the identification and characterization of all functional oligosaccharides in a cell. Currently there are three typical protocols for analyzing glycoproteins: oligosaccharides are released from proteins and then analyzed by tandem or higher-order MS to elucidate their structures; glycoproteins are enriched and then identified by tandem MS using typical proteomics methodologies, without addressing glycans explicitly; and glycoproteins are purified and analyzed by MS/MS to characterize and profile the microheterogeneity of site-specific glycosylation. Glycomics projects require intensive computational analysis of the MS spectra. The objective of this proposal is to develop computational methods for solving several fundamental problems emerging from MS-based glycomics projects, and to develop urgently needed courses and tutorials for educating students and researchers in glycoinformatics. It consists of three specific aims: (1) to continue developing algorithms and software for characterizing oligosaccharide structures, including their sequences, branching and linkages, from tandem mass spectra; (2) to develop algorithms and software for characterizing site-specific glycosylation in glycoproteins from LC/MS/MS analyses; and (3) to develop new courses in glycoinformatics and proteome informatics. Achieving these aims will provide useful software tools and teaching materials to the glycomics community. Furthermore, it will enable the automation of glycoproteomics analysis of complex samples. The proposed research is closely tied to the PI's efforts in the multidisciplinary training of graduate students, particularly in the areas of glycome and proteome informatics, in the School of Informatics at Indiana University.

Project Report

Carbohydrates (or glycans) inside a living cell do more than serve as a major energy source. The essential functions of glycans and glycoconjugates (glycoproteins and glycolipids) in many cellular events are well known. As laid out in a recent National Academy of Sciences (NAS) report titled "Transforming Glycoscience: A Roadmap for the Future", glycoscience, the study of glycans and glycoconjugates, is critical for three areas: 1) the understanding of human health and disease; 2) the search for alternative sources of energy; and 3) the development of new materials. All three research areas demand new technologies to characterize the structures of glycans and glycoconjugates, whether naturally occurring or artificial. The NAS report emphasizes that "a suite of tools … is needed to detect, describe, and purify glycans from natural sources, and characterize their chemical composition and structure". In the past decade, high-throughput technologies, in particular mass spectrometry techniques, have been developed to characterize glycans and glycoconjugates at low cost; together they constitute the new field of glycomics. However, as in other -omics areas, applying these techniques requires user-friendly software tools for the automated interpretation of experimental data, because high-throughput techniques by their nature generate large amounts of data. This project aims to address this issue by developing software packages for the structural characterization of glycans and glycopeptides from high-throughput mass spectrometric data. During the funding period, we developed two software suites that work from mass spectrometric data: MultiGlycan, for the identification and quantification of glycans, and GlycoFragwork, for the identification and quantification of intact glycopeptides.
Both software packages are currently released to the public as free software (at http://darwin.informatics.indiana.edu/MultiGlycan/ and http://darwin.informatics.indiana.edu/col/GlycoFragwork/). We will continue supporting and improving these tools for use by the glycobiology research community. We have also applied these tools to the analysis of complex human blood samples, identifying over 100 intact glycopeptides, among which several glycans and glycopeptides were found to be putatively linked to human diseases. Based on these findings, we will carry out follow-up research on a larger pool of samples. The software tools are likely to have an impact on many disciplines where MS-based high-throughput glycomic techniques are applied. We will seek industrial partners (e.g., MS instrument vendors) to continue developing these tools and to make them available to a broader research community that may use these glycomic techniques but may not be familiar with the underlying concepts.

Agency
National Science Foundation (NSF)
Institute
Division of Biological Infrastructure (DBI)
Application #
0642897
Program Officer
Peter H. McCartney
Project Start
Project End
Budget Start
2007-06-01
Budget End
2013-05-31
Support Year
Fiscal Year
2006
Total Cost
$593,622
Indirect Cost
Name
Indiana University
Department
Type
DUNS #
City
Bloomington
State
IN
Country
United States
Zip Code
47401