The primary purpose of this proposal is to provide the applicant with the means and structure to achieve two goals: (1) to develop intelligent computational aids for mining the proteomic data accumulating from high-throughput techniques such as SELDI-TOF mass spectrometry; and (2) in the long term, to gain independence as a biomedical informatics researcher by developing methodological expertise in Bayesian methods and proteomic technologies. The applicant will obtain further instruction in probabilistic methods of data analysis and will receive education on the proteomic technologies that are driving today's proteome research. Training will be provided through formal coursework, directed readings, seminars, and conferences, in addition to research directed by excellent mentors. The applicant's research project involves a novel combination of techniques for proteomic data analysis. Previous research has applied techniques such as genetic algorithms and neural networks to proteomic data, but these techniques were not explicitly designed to take background and prior knowledge into account. The hypothesis of this project is that combining background knowledge with machine learning techniques can improve the selection of appropriate biomarkers from proteomic data, enabling efficient and accurate analysis of the massive datasets arising from proteomic profiling studies. The project therefore has four aims: (1) develop a wrapper-based machine learning tool; (2) augment the tool with prior knowledge such as heuristic rules and relationships in the data; (3) use the resulting features, along with de-identified patient information, as input to classification systems; and (4) evaluate existing techniques for interpreting tandem mass spectrometry (MS-MS or MS/MS) data, and propose, implement, and evaluate a Bayesian method for identifying the peptides and proteins indicated by the MS-MS spectrum.
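
As a rough illustration of what a wrapper-based approach to feature selection can look like, the sketch below scores candidate feature subsets by the cross-validated accuracy of a classifier and grows the subset greedily. It is not the tool described in the proposal: the nearest-centroid classifier, the leave-one-out scoring, the greedy forward search, the function names, and the toy intensity values are all assumptions chosen to keep the example small.

```python
from statistics import mean


def centroid_predict(train_X, train_y, x, features):
    """Nearest-centroid prediction restricted to the chosen feature indices."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [mean(r[f] for r in rows) for f in features]

    def dist(c):
        # Squared Euclidean distance over the selected features only.
        return sum((x[f] - cf) ** 2 for f, cf in zip(features, c))

    # Sorting the labels makes tie-breaking deterministic.
    return min(sorted(centroids), key=lambda lab: dist(centroids[lab]))


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of the classifier on a feature subset."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if centroid_predict(train_X, train_y, X[i], features) == y[i]:
            hits += 1
    return hits / len(X)


def forward_select(X, y, max_features=None):
    """Greedy forward wrapper: add the feature that most improves accuracy."""
    n_features = len(X[0])
    limit = max_features or n_features
    selected, best = [], 0.0
    while len(selected) < limit:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(loo_accuracy(X, y, selected + [f]), f) for f in candidates]
        # Prefer the higher score; on ties, prefer the lower feature index.
        score, feature = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best:
            break  # no candidate improves on the current subset
        selected.append(feature)
        best = score
    return selected, best


# Hypothetical toy data: three peak intensities per sample (made-up values).
X = [
    [5.0, 1.0, 3.0],
    [2.0, 1.2, 9.0],
    [8.0, 0.9, 4.0],
    [3.0, 5.1, 8.0],
    [7.0, 4.8, 2.0],
    [1.0, 5.3, 6.0],
]
y = ["control", "control", "control", "case", "case", "case"]

selected, best = forward_select(X, y)
print(selected, best)
```

The point of the wrapper design is that features are judged by how well the downstream classifier performs with them, rather than by a classifier-independent statistic; prior knowledge of the kind named in the second aim could, under the same assumptions, be used to restrict or reorder the candidate list.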

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Mentored Quantitative Research Career Development Award (K25)
Project #
5K25GM071951-05
Application #
7460715
Study Section
Special Emphasis Panel (ZRG1-BDMA (01))
Program Officer
Whitmarsh, John
Project Start
2004-07-01
Project End
2009-06-30
Budget Start
2008-07-01
Budget End
2009-06-30
Support Year
5
Fiscal Year
2008
Total Cost
$132,674
Indirect Cost
Name
University of Pittsburgh
Department
Internal Medicine/Medicine
Type
Schools of Medicine
DUNS #
004514360
City
Pittsburgh
State
PA
Country
United States
Zip Code
15213
Grover, Himanshu; Wallstrom, Garrick; Wu, Christine C et al. (2013) Context-sensitive Markov models for peptide scoring and identification from tandem mass spectrometry. OMICS 17:94-105
Lustgarten, Jonathan L; Visweswaran, Shyam; Gopalakrishnan, Vanathi et al. (2011) Application of an efficient Bayesian discretization method to biomedical data. BMC Bioinformatics 12:309
Ryberg, Henrik; An, Jiyan; Darko, Samuel et al. (2010) Discovery and verification of amyotrophic lateral sclerosis biomarkers by proteomics. Muscle Nerve 42:104-11
Gopalakrishnan, Vanathi; Lustgarten, Jonathan L; Visweswaran, Shyam et al. (2010) Bayesian rule learning for biomedical data mining. Bioinformatics 26:668-75
Lustgarten, Jonathan L; Gopalakrishnan, Vanathi; Visweswaran, Shyam (2009) Measuring stability of feature selection in biomedical datasets. AMIA Annu Symp Proc 2009:406-10
Lustgarten, Jonathan L; Visweswaran, Shyam; Bowser, Robert P et al. (2009) Knowledge-based variable selection for learning rules from proteomic data. BMC Bioinformatics 10 Suppl 9:S16
Ranganathan, Srikanth; Williams, Eric; Ganchev, Philip et al. (2005) Proteomic profiling of cerebrospinal fluid identifies biomarkers for amyotrophic lateral sclerosis. J Neurochem 95:1461-71