The correct functioning of many proteins depends on glycosylation, the addition of sugar molecules (glycans) to selected amino acids in the protein. For example, cancer cells have different glycosylation patterns than ordinary cells, and there is strong evidence that glycoproteins on the surface of egg cells play an essential role in sperm binding. Despite the importance of glycosylation, there are as yet no reliable, high-throughput methods for determining the identity and location of glycans. Glycan identification is currently a manual procedure for experts, involving a combination of chemical assays and mass spectrometry. The automation of the process would have a significant impact on our understanding of this important biological process. The proposed project aims to invent chemical procedures, algorithms, and software for high-throughput analysis of glycan mass spectrometry data. The goal is to bring glycan analysis up to the level of peptide analysis within 3 years. In contrast to peptide analysis, which can leverage genomics data, glycan analysis requires the incorporation of expert knowledge of synthetic pathways, in order to limit the huge number of theoretical combinations of monosaccharides to the much smaller number that are actually synthesized in nature. The project will have to develop novel representations for the evolving expert knowledge, because an exhaustive list, analogous to the human genome, is not currently known. Along with expert knowledge, the project will develop and validate machine learning and statistical techniques for glycan identification. In particular, the project will develop methods for internally calibrating spectra, and will learn fragmentation patterns that can statistically distinguish different types of glycosidic linkages.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM074128-03
Application #
7239477
Study Section
Special Emphasis Panel (ZRG1-BPC-Q (02))
Program Officer
Edmonds, Charles G
Project Start
2005-06-01
Project End
2010-05-31
Budget Start
2007-06-01
Budget End
2008-05-31
Support Year
3
Fiscal Year
2007
Total Cost
$318,904
Indirect Cost
Name
Palo Alto Research Center
Department
Type
DUNS #
112219014
City
Palo Alto
State
CA
Country
United States
Zip Code
94304
Goldberg, David; Bern, Marshall; North, Simon J et al. (2009) Glycan family analysis for deducing N-glycan topology from single MS. Bioinformatics 25:365-71
Goldberg, David; Bern, Marshall; Parry, Simon et al. (2007) Automated N-glycopeptide identification using a combination of single- and tandem-MS. J Proteome Res 6:3995-4005
Goldberg, David; Bern, Marshall; Li, Bensheng et al. (2006) Automatic determination of O-glycan structure from fragmentation spectra. J Proteome Res 5:1429-34
Comelli, Elena M; Sutton-Smith, Mark; Yan, Qi et al. (2006) Activation of murine CD4+ and CD8+ T lymphocytes leads to dramatic remodeling of N-linked glycans. J Immunol 177:2431-40