This small grant for exploratory research will support preliminary work on a novel de novo peptide sequencing method, based on bioinformatics and mass spectrometry, for sequencing conserved proteins from taxa whose proteins or genomes have not previously been sequenced. The method will provide molecular phylogenetic hypotheses and could demonstrate peptide content in ancient fossils.

This information will help elucidate the phylogeny of uncharacterized taxa, and will be useful to biological and paleontological researchers. The strategy relies on searching tandem mass spectra of peptides against a database of predicted protein sequences, generated from the known sequences of the same proteins in neighboring taxa. Hundreds of predicted protein sequence database entries will be generated using bioinformatics approaches such as sequence alignment, simple weighted consensus, and point accepted mutation (PAM) matrices, and the process will be automated and released as shareware on the web. Novel peptide sequences will be discovered through unambiguous matching of tandem mass spectra against the predicted protein sequences, using database-scoring algorithms for mass spectrometry based proteomics such as SEQUEST. Novel sequences, unique to the taxon in question, will be rigorously validated by score cutoffs and manual inspection, then by synthesizing the putative peptides for tandem mass spectral analysis and, where possible, by RNA sequencing of the test taxon.

The analysis will be developed for several modern unsequenced taxa, including ostrich and alligator. The method will ultimately be used to sequence unique peptides from the soft tissue of the 68-million-year-old Tyrannosaurus rex fossil found in Montana in 2003. This work will NOT (and in fact, cannot) result in the complete genetic sequencing of T. rex: we do not have a full set of T. rex proteins (the proteome), and if peptides have survived, they are separated from the actual DNA code by levels of transcription and translation that are truly lost in time.
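The database-generation and matching strategy described above can be illustrated with a minimal sketch. It is not the project's implementation: the aligned sequences, the weights, and the function names (weighted_consensus, candidate_peptides, match_precursor) are all hypothetical, the alignment is assumed to be gap-free, and candidates are matched on precursor mass alone, whereas a tool such as SEQUEST scores the full fragment spectrum.

```python
from collections import Counter
from itertools import product

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum for a free peptide


def weighted_consensus(aligned, weights):
    """Return the residue with the highest summed weight in each column."""
    consensus = []
    for column in zip(*aligned):
        totals = Counter()
        for residue, weight in zip(column, weights):
            totals[residue] += weight
        consensus.append(max(totals, key=totals.get))
    return "".join(consensus)


def candidate_peptides(aligned, max_candidates=1000):
    """Enumerate combinations of the residues observed in each column."""
    options = [sorted(set(column)) for column in zip(*aligned)]
    candidates = []
    for combo in product(*options):
        candidates.append("".join(combo))
        if len(candidates) >= max_candidates:
            break
    return candidates


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def match_precursor(observed_mass, candidates, tol_ppm=10.0):
    """Return candidates whose mass lies within tol_ppm of observed_mass."""
    matches = []
    for peptide in candidates:
        error_ppm = abs(peptide_mass(peptide) - observed_mass) / observed_mass * 1e6
        if error_ppm <= tol_ppm:
            matches.append(peptide)
    return matches


# Hypothetical aligned peptides from three neighboring taxa (toy data).
aligned = ["GAPGPR", "GAPGAR", "GSPGPR"]
weights = [2.0, 1.0, 1.0]  # assumed weights, e.g. favoring the closest taxon

consensus = weighted_consensus(aligned, weights)
candidates = candidate_peptides(aligned)
```

Here candidate_peptides produces sequences such as GSPGAR that occur in none of the input taxa, which is the sense in which a predicted database can surface a peptide unique to the test taxon. A substitution-matrix approach would extend the per-column options beyond the residues actually observed.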

The advantage of this new approach to fossils is that very low levels of proteinaceous material, including modified proteins, can be sequenced in a high-throughput manner with highly sensitive and fast-scanning mass spectrometers. Once a database of many unique protein sequences has been compiled, the method can be used to accurately identify the organisms from which fossils derive. It will also help researchers working with extant organisms whose genomes have not been sequenced, and will provide a concrete test of assumptions made about molecular clocks in peptides and proteins. In addition, examination of the fossil materials will open a new window into molecular-level taphonomy, and establish a precedent for the study of molecular phylogeny in dinosaurs.

Agency: National Science Foundation (NSF)
Institute: Division of Earth Sciences (EAR)
Application #: 0634136
Program Officer: Paul E Filmer
Project Start:
Project End:
Budget Start: 2006-10-01
Budget End: 2008-09-30
Support Year:
Fiscal Year: 2006
Total Cost: $112,736
Indirect Cost:
Name: Beth Israel Deaconess Medical Center
Department:
Type:
DUNS #:
City: Boston
State: MA
Country: United States
Zip Code: 02215