Markovian Models for Protein Identification From Tandem Mass Spectrometry

Gopalakrishnan, Vanathi

Abstract

This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. Primary support for the subproject and the subproject's principal investigator may have been provided by other sources, including other NIH sources. The Total Cost listed for the subproject likely represents the estimated amount of Center infrastructure utilized by the subproject, not direct funding provided by the NCRR grant to the subproject or subproject staff.

Biomedical research has been revolutionized by technological advances that have led to a massive accumulation of data. These data now need to be mined to draw actionable insights into the underlying biological processes. Complex machine learning algorithms are being developed to analyze these large datasets automatically and to produce robust models that explain the observed data. Such models are then used to identify patterns in the data that help solve challenging decision problems such as the diagnosis and prognosis of disease. Our research involves one such class of algorithms, Hidden Markov Models, which are used extensively in sequential data mining problems in biology. Our particular focus is on developing novel algorithms for identifying and quantifying protein sequences in complex biological samples using data produced by mass spectrometers. Such analysis will lead to molecular characterization of target conditions such as disease states. Our algorithms learn models from large training datasets and are computationally intensive. In addition, to learn a robust model that will perform well across a variety of future test data, we propose to perform large-scale experiments with different model topologies and features. This requires learning hundreds of different models, which amounts to many days of computation.
However, the entire set of experiments is trivially parallelizable, since all the models can be learned independently of one another; hence the need for computing machines that can run multiple jobs in parallel. Our in-house algorithms are implemented in Python and can take advantage of multiple processing units or cores. After we spoke with consultants at PSC, they advised us that the Blacklight machines are the most suitable for our needs.
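
The independence of the per-model jobs described above can be illustrated with a minimal Python sketch. It is not the project's code: the function names (`forward_log_likelihood`, `evaluate_config`, `run_experiments`), the toy two-symbol HMM, and the grid of state counts and self-transition probabilities are all assumptions made for illustration. In particular, `evaluate_config` only scores a fixed-parameter HMM with the forward algorithm; a real job would train a model at that point (for example with Baum-Welch).

```python
import math
from concurrent.futures import ProcessPoolExecutor


def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs) for a discrete HMM."""
    n_states = len(start)
    # Initialise with the start distribution and the first emission.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        # Rescale to avoid underflow; the log of the scales sums to log P(obs).
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def evaluate_config(config):
    """Hypothetical stand-in for one independent job (no real training here)."""
    n_states, self_loop = config
    n_symbols = 2
    start = [1.0 / n_states] * n_states
    if n_states == 1:
        trans = [[1.0]]
    else:
        off = (1.0 - self_loop) / (n_states - 1)
        trans = [
            [self_loop if r == s else off for s in range(n_states)]
            for r in range(n_states)
        ]
    # Toy emissions: state s favours symbol s % n_symbols.
    emit = [
        [0.8 if k == s % n_symbols else 0.2 for k in range(n_symbols)]
        for s in range(n_states)
    ]
    obs = [0, 0, 1, 1, 0, 0, 1, 1]  # made-up observation sequence
    return config, forward_log_likelihood(obs, start, trans, emit)


def run_experiments(configs, workers=1):
    """Evaluate every configuration; jobs share no state, so order is free."""
    if workers == 1:
        return [evaluate_config(c) for c in configs]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the input order of the configurations.
        return list(pool.map(evaluate_config, configs))
```

Because each configuration is evaluated without reference to any other, calling `run_experiments(grid, workers=4)` from a script's `if __name__ == "__main__":` block would spread the jobs over four processes, and the same pattern would extend to one batch job per configuration on a larger machine.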

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Center for Research Resources (NCRR)
Type: Biotechnology Resource Grants (P41)
Project #: 3P41RR006009-20S1
Application #: 8364375
Study Section: Special Emphasis Panel (ZRG1-BCMB-Q (40))

Project Start: 2011-09-15
Project End: 2013-07-31
Budget Start: 2011-09-15
Budget End: 2013-07-31
Support Year: 20
Fiscal Year: 2011
Total Cost: $1,094
Indirect Cost

Institution

Name: Carnegie-Mellon University
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 052184116

City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213
