The primary purpose of this proposal is to provide the applicant with the means and structure to achieve two goals: (1) to develop intelligent computational aids for mining the proteomic data accumulating from high-throughput techniques such as SELDI-TOF mass spectrometry; and (2) as a long-term goal, to gain independence as a biomedical informatics researcher by developing methodological expertise in Bayesian methods and proteomic technologies. The applicant will obtain further instruction in probabilistic methods of data analysis and will receive education on the proteomic technologies that are driving today's proteome research. Training will be provided through formal coursework, directed readings, seminars, and conferences, in addition to research directed by excellent mentors. The applicant's research project involves a novel combination of techniques for proteomic data analysis. Previous research has applied techniques such as genetic algorithms and neural networks to proteomic data, but these techniques were not explicitly designed to take background and prior knowledge into account. The hypothesis of this project is that background knowledge and machine learning techniques can positively influence the selection of appropriate biomarkers from proteomic data, enabling efficient and accurate analysis of the massive datasets arising from proteomic profiling studies. Accordingly, this project has four aims: (1) develop a wrapper-based machine learning tool; (2) augment the tool with prior knowledge, such as heuristic rules and relationships in the data; (3) use the resulting features, along with de-identified patient information, as input to classification systems; and (4) evaluate existing techniques for interpreting tandem mass spectrometry (MS-MS or MS/MS) data, and propose, implement, and evaluate a Bayesian method for identifying the peptides and proteins indicated by an MS/MS spectrum.
Grover, Himanshu; Wallstrom, Garrick; Wu, Christine C et al. (2013) Context-sensitive Markov models for peptide scoring and identification from tandem mass spectrometry. OMICS 17:94-105
Lustgarten, Jonathan L; Visweswaran, Shyam; Gopalakrishnan, Vanathi et al. (2011) Application of an efficient Bayesian discretization method to biomedical data. BMC Bioinformatics 12:309
Gopalakrishnan, Vanathi; Lustgarten, Jonathan L; Visweswaran, Shyam et al. (2010) Bayesian rule learning for biomedical data mining. Bioinformatics 26:668-75
Ryberg, Henrik; An, Jiyan; Darko, Samuel et al. (2010) Discovery and verification of amyotrophic lateral sclerosis biomarkers by proteomics. Muscle Nerve 42:104-11
Lustgarten, Jonathan L; Gopalakrishnan, Vanathi; Visweswaran, Shyam (2009) Measuring stability of feature selection in biomedical datasets. AMIA Annu Symp Proc 2009:406-10
Lustgarten, Jonathan L; Visweswaran, Shyam; Bowser, Robert P et al. (2009) Knowledge-based variable selection for learning rules from proteomic data. BMC Bioinformatics 10 Suppl 9:S16
Ranganathan, Srikanth; Williams, Eric; Ganchev, Philip et al. (2005) Proteomic profiling of cerebrospinal fluid identifies biomarkers for amyotrophic lateral sclerosis. J Neurochem 95:1461-71