This grant proposes using a state-of-the-art software integration platform to deliver cutting-edge proteomics research to the scientific community. The PeptideAtlas is used by researchers and clinicians to explore the proteomes that are available using the current generation of mass spectrometers. Such exploration allows for the discovery of candidate diagnostic biomarkers, in particular those that involve post-translational modifications. The importance of the PeptideAtlas will grow as proteomics evolves towards targeted approaches, in which specific subsets of proteins are searched for and an understanding of prior experiments becomes fundamental to all analyses. As this wealth of information becomes increasingly important, it must be readily accessible to the scientific community. Making this information, and the associated tools, available to researchers and clinicians requires a framework that supports: a means to portray the rich semantics required for proteomics; a high level of interoperability to support integration with numerous tools and processes; and a distributed system that supports dynamic discovery, so that the latest information is always available to the researcher. The caGRID offers such a distributed and interoperable framework, so a natural progression for the PeptideAtlas is to exploit this technology to ensure its use by the wider research and clinical community.

Public Health Relevance

Proteomics via mass spectrometry has become an important tool for understanding how the complex system driven by our genome functions at all levels. Proteomics information is therefore critical to the discovery and validation of disease and therapeutic biomarkers. This grant will make available to the research and clinical communities the single largest repository of mass spectrometry-derived proteomics information. The project will deliver to the community knowledge about the proteomes of a large number of medically relevant species. It will give direct access to the PeptideAtlas and its associated tools, which represent information gained from over five years of large-scale tandem MS experiments. The information within the PeptideAtlas is of high quality, well annotated, and well organized.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section: Special Emphasis Panel (ZRG1-BST-G (50))
Program Officer: Li, Jerry
Institute for Systems Biology
United States
Killcoyne, Sarah; Handcock, Jeremy; Robinson, Thomas et al. (2012) Interfaces to PeptideAtlas: a case study of standard data access systems. Brief Bioinform 13:615-26
Killcoyne, Sarah; Deutsch, Eric W; Boyle, John (2012) Mining PeptideAtlas for biomarkers and therapeutics in human disease. Curr Pharm Des 18:748-54
Lewis, Steven; Csordas, Attila; Killcoyne, Sarah et al. (2012) Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework. BMC Bioinformatics 13:324
Boyle, John; Kreisberg, Richard; Bressler, Ryan et al. (2012) Methods for visual mining of genomic and proteomic data atlases. BMC Bioinformatics 13:58
Robinson, Thomas; Killcoyne, Sarah; Bressler, Ryan et al. (2011) SAMQA: error classification and validation of high-throughput sequenced read data. BMC Genomics 12:419
Knijnenburg, Theo A; Lin, Jake; Rovira, Hector et al. (2011) EPEPT: a web service for enhanced P-value estimation in permutation tests. BMC Bioinformatics 12:411
Handcock, Jeremy; Deutsch, Eric W; Boyle, John (2010) mspecLINE: bridging knowledge of human disease with the proteome. BMC Med Genomics 3:7
Rovira, Hector; Killcoyne, Sarah; Shmulevich, Ilya et al. (2010) An integration architecture designed to deal with the issues of biological scope, scale and complexity. Lect Notes Comput Sci 6254:179-191
Burdick, David B; Cavnor, Chris C; Handcock, Jeremy et al. (2010) SEQADAPT: an adaptable system for the tracking, storage and analysis of high throughput sequencing experiments. BMC Bioinformatics 11:377
Boyle, John; Rovira, Hector; Cavnor, Chris et al. (2009) Adaptable data management for systems biology investigations. BMC Bioinformatics 10:79
