Robust genome sequencing technology has resulted in over 180 completed genomes, with sequencing projects for an additional 700+ organisms in progress. The difficult and important problem of experimentally determining the proteins encoded by these genomes lags far behind. We propose to complement existing messenger-RNA based approaches with high-throughput mass spectrometry of the entire protein complement of a complex animal, the nematode Caenorhabditis elegans. Our approach combines open-reading-frame (ORF) analysis of the fully sequenced C. elegans genome with high-throughput mass spectrometry, using multidimensional protein identification technology (MudPIT). Our long-term goal is development of these methods to the point that at least 80% of all proteins in a newly sequenced organism can be identified in a few months of concerted effort by a small group of investigators. This goal requires development of the following tools: 1) efficient evolutionary analysis of genomic ORFs to identify a computationally manageable set of candidate peptides for mass spectrum matching; 2) a robust method for biochemical fractionation of intact proteins from whole organisms or tissues; and 3) analytical approaches to assessing the significance of MudPIT matches to specific candidate peptides. Peptide cleavage, fractionation, and 2-dimensional (2D) mass spectrometry methods are established in our labs and are currently sufficient to achieve our goal with the addition of these tools. Our specific milestone for this 2-year grant period is the identification of at least 10,000 unique proteins (>50% of all predicted proteins in C. elegans) and validation by orthogonal methods of at least 50 of the proteins that are not yet supported by other data. The end result will be both an extensive map of the C. elegans proteome and a high-throughput pipeline that will allow similar analysis of any complex animal or plant proteome whose genome sequence is available.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Exploratory/Developmental Grants (R21)
Project #
5R21GM074787-02
Application #
7230204
Study Section
Enabling Bioanalytical and Biophysical Technologies Study Section (EBT)
Program Officer
Edmonds, Charles G
Project Start
2006-05-01
Project End
2009-04-30
Budget Start
2007-05-01
Budget End
2009-04-30
Support Year
2
Fiscal Year
2007
Total Cost
$151,394
Indirect Cost
Name
University of Washington
Department
Genetics
Type
Schools of Medicine
DUNS #
605799469
City
Seattle
State
WA
Country
United States
Zip Code
98195
Merrihew, Gennifer E; Davis, Colleen; Ewing, Brent et al. (2008) Use of shotgun proteomics for the identification, confirmation, and correction of C. elegans gene annotations. Genome Res 18:1660-9