Proteins are the primary functional molecules in living cells, and tandem mass spectrometry provides the most efficient means of studying proteins in a high-throughput fashion. This proposal aims to use state-of-the-art methods from the fields of machine learning, statistics, and natural language processing to improve our ability to make sense of large tandem mass spectrometry data sets. Our project will focus on three key problems in the analysis of such data: (1) facilitating the use of previously annotated spectra to improve our ability to annotate new spectra by creating a hybrid search scheme that compares an observed spectrum to a database composed of theoretical spectra and previously annotated spectra; (2) enabling the efficient and accurate detection of peptides containing post-translational modifications and sequence variants; and (3) detecting sets of peptide species that are co-fragmented in the mass spectrometer and hence give rise to complex mixture spectra. Each of these aims will improve the ability of mass spectrometrists to efficiently and accurately identify and quantify proteins in complex mixtures. To increase the impact of our work, we will continue to make all of our tools available as free software.
The applications of mass spectrometry, and its promise for improving human health, are numerous. They include an increased understanding of disease phenotypes and the molecular mechanisms that underlie them, as well as vastly more sensitive and specific diagnostic and prognostic screens. However, making optimal use of mass spectrometry data requires sophisticated computational methods. This project will develop and apply novel statistical and machine learning methods for interpreting mass spectra.
|Keich, Uri; Noble, William Stafford (2018) Controlling the FDR in imperfect matches to an incomplete database. J Am Stat Assoc 113:973-982|
|Lin, Andy; Howbert, J Jeffry; Noble, William Stafford (2018) Combining High-Resolution and Exact Calibration To Boost Statistical Power: A Well-Calibrated Score Function for High-Resolution MS2 Data. J Proteome Res 17:3644-3656|
|Hu, Alex; Lu, Yang Young; Bilmes, Jeff et al. (2018) Joint Precursor Elution Profile Inference via Regression for Peptide Detection in Data-Independent Acquisition Mass Spectra. J Proteome Res :|
|Bittremieux, Wout; Meysman, Pieter; Noble, William Stafford et al. (2018) Fast Open Modification Spectral Library Searching through Approximate Nearest Neighbor Indexing. J Proteome Res 17:3463-3474|
|Ting, Ying S; Egertson, Jarrett D; Bollinger, James G et al. (2017) PECAN: library-free peptide detection for data-independent acquisition tandem mass spectrometry data. Nat Methods 14:903-908|
|Noble, William Stafford; Keich, Uri (2017) Response to "Mass spectrometrists should search for all peptides, but assess only the ones they care about". Nat Methods 14:644|
|Keich, Uri; Noble, William Stafford (2017) Progressive calibration and averaging for tandem mass spectrometry statistical confidence estimation: Why settle for a single decoy? Res Comput Mol Biol 10229:99-116|
|Sakano, Hitomi; Zorio, Diego A R; Wang, Xiaoyu et al. (2017) Proteomic analyses of nucleus laminaris identified candidate targets of the fragile X mental retardation protein. J Comp Neurol 525:3341-3359|
|May, Damon H; Tamura, Kaipo; Noble, William S (2017) Param-Medic: A Tool for Improving MS/MS Database Search Yield by Optimizing Parameter Settings. J Proteome Res 16:1817-1824|