The effectiveness of shotgun proteomics for cancer samples has been curtailed by two key challenges. First, cancer is often accompanied by deficits in apoptosis, cell division, and DNA repair, as well as by inflammatory responses; these changes lead to heterogeneity in proteins due to mutation and chemical modification. Peptides that differ from the reference sequences in databases are not identified by standard database search algorithms. Second, the degree of homology among proteins in human sequence databases introduces significant problems when identified peptides are assembled into protein identifications; a peptide may be an exact match to dozens of protein sequences, which inflates the number of proteins that researchers report. We propose an integrated set of algorithms designed to address these shortcomings. First, we will develop "sequence tagging" software to infer partial sequences from tandem mass spectra by repurposing research in database search algorithms. Second, we will create algorithms to reconcile partial peptide matches to these spectra in order to identify peptides that vary from reference sequences by mutations and modifications. Third, we will develop a modular framework for assembling these peptide identifications into proteins that will incorporate estimated false positive rates and multiple forms of peptides. The framework will use clustering techniques when applying parsimony rules, to reduce the effects of database homology on reported protein lists. These open-source tools will be developed using standard file formats and be supported by code documentation to promote their widespread use.
Proteomics can potentially make powerful contributions to clinical diagnosis and research, but the bioinformatics that enable this technology have critical shortcomings that prevent its efficient translation from a research tool to a clinical tool.
We propose new systems for improving the application of proteomics to clinical samples by improving the identification of modified and mutant protein forms and by managing sets of related proteins.
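The parsimony idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the project's software: the function names, the toy peptide strings, and the choice of a greedy set-cover heuristic are all assumptions made for illustration. Proteins that match exactly the same peptides are first clustered into one group, and groups are then chosen until every identified peptide is explained.

```python
from collections import defaultdict


def group_indistinguishable(protein_to_peptides):
    """Cluster proteins whose identified peptide sets are identical."""
    clusters = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        clusters[frozenset(peptides)].append(protein)
    # Key each group by a sorted tuple of its member proteins.
    return {tuple(sorted(members)): set(peptides)
            for peptides, members in clusters.items()}


def parsimonious_groups(protein_to_peptides):
    """Greedy set cover (an approximation, assumed here for simplicity):
    repeatedly pick the group that explains the most unexplained peptides."""
    groups = group_indistinguishable(protein_to_peptides)
    unexplained = set().union(*groups.values()) if groups else set()
    selected = []
    while unexplained:
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(groups), key=lambda g: len(groups[g] & unexplained))
        selected.append(best)
        unexplained -= groups[best]
    return selected


# Hypothetical toy data: P2 matches a subset of P1's peptides,
# and P3 and P4 cannot be told apart by the evidence.
hits = {
    "P1": {"AAK", "CCR", "DDK"},
    "P2": {"AAK", "CCR"},
    "P3": {"EEK"},
    "P4": {"EEK"},
}
result = parsimonious_groups(hits)
print(result)
```

In this toy case four candidate proteins reduce to two reported groups: P1 alone, and P3 with P4 as a single indistinguishable group. A greedy cover is not guaranteed to be minimal; it is used here only because it is short to state.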

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section
Special Emphasis Panel (ZCA1-SRRB-9 (O1))
Program Officer
Rodriguez, Henry
Vanderbilt University Medical Center
Schools of Medicine
United States
Wang, Xiaojing; Zhang, Bing (2014) Integrating genomic, transcriptomic, and interactome data to improve peptide and protein identification in shotgun proteomics. J Proteome Res 13:2715-23
Wang, Xiaojing; Liu, Qi; Zhang, Bing (2014) Leveraging the complementary nature of RNA-Seq and shotgun proteomics data. Proteomics 14:2676-87
Holman, Jerry D; Dasari, Surendra; Tabb, David L (2013) Informatics of protein and posttranslational modification detection via shotgun proteomics. Methods Mol Biol 1002:167-79
Tabb, David L (2012) Evaluating protein interactions through cross-linking mass spectrometry. Nat Methods 9:879-81
Chen, Yao-Yi; Dasari, Surendra; Ma, Ze-Qiang et al. (2012) Refining comparative proteomics by spectral counting to account for shared peptides and multiple search engines. Anal Bioanal Chem 404:1115-25
Holman, Jerry D; Ma, Ze-Qiang; Tabb, David L (2012) Identifying proteomic LC-MS/MS data sets with Bumbershoot and IDPicker. Curr Protoc Bioinformatics Chapter 13:Unit13.17
Dasari, Surendra; Chambers, Matthew C; Martinez, Misti A et al. (2012) Pepitome: evaluating improved spectral library search for identification complementarity and quality assessment. J Proteome Res 11:1686-95
Ma, Ze-Qiang; Polzin, Kenneth O; Dasari, Surendra et al. (2012) QuaMeter: multivendor performance metrics for LC-MS/MS proteomics instrumentation. Anal Chem 84:5845-50
Martinez, Melissa N; Emfinger, Christopher H; Overton, Matthew et al. (2012) Obesity and altered glucose metabolism impact HDL composition in CETP transgenic mice: a role for ovarian hormones. J Lipid Res 53:379-89
Chambers, Matthew C; Maclean, Brendan; Burke, Robert et al. (2012) A cross-platform toolkit for mass spectrometry and proteomics. Nat Biotechnol 30:918-20

Showing the most recent 10 out of 33 publications