Utilizing TeraGrid to Detect Remote Similarity Protein Sequences

Fienup, Mark

Abstract

This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator. The structure of a protein is often a key to its function. However, significant time and cost are required to determine the structure of a protein by experimental methods such as X-ray crystallography or nuclear magnetic resonance (NMR). There are currently fewer than 50,000 protein structures deposited in the Protein Data Bank (PDB), of which about 80% are redundant. On the other hand, genomic sequencing efforts such as the Human Genome Project have populated protein sequence databases with well over 5 million sequences. With the increasing gap between known sequences and experimentally determined structures, computational methods capable of predicting the structure and function of proteins will play an increasing role in protein annotation studies. The ultimate goal of the research described in this proposal is to develop a new protein sequence homology detection method that leverages the growing body of protein sequence data in ways that existing methods do not. Increased sensitivity in recognizing relationships between amino acid sequences will be achieved by applying intermediate sequence search strategies and profile-profile techniques. To date, progress in this area has been limited by the lack of computational resources needed to perform the transitive profile-profile search. We propose to utilize the TeraGrid to develop and test the first intermediate profile-profile algorithm for detecting protein sequence similarities.
The algorithm constructs a sequential profile for the input amino acid sequence (the target) and uses it to transitively search the database of all representative profiles for sequences in nr. In the transitive search, the matches found in the first sequence comparison are used as new queries against the database, and the whole process is repeated iteratively with each new set of matches. The similarity between the target profile and a profile from the database is thus established through intermediate sequences. Our project will be carried out in two stages: 1. In the first stage, we will generate the set of representative alignment profiles for sequences from the non-redundant protein sequence database nr. 2. In the second stage, we will deploy and test our algorithm.
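
The transitive search described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the function name `transitive_search`, the `is_match` callback, the `max_rounds` cutoff, and the toy data are all assumptions introduced for illustration. Profiles are stood in for by small sets of integers, and two "profiles" count as a match when they share at least two elements; a real profile-profile comparison would score an alignment between position-specific profiles instead.

```python
from typing import Callable, Dict, List, Set, Tuple

Profile = Set[int]  # toy stand-in for a real sequence profile (assumption)


def transitive_search(
    target: Profile,
    database: Dict[str, Profile],
    is_match: Callable[[Profile, Profile], bool],
    max_rounds: int = 3,
) -> Dict[str, List[str]]:
    """Search the database with the target, then reuse each new match as a query.

    Returns a mapping from each hit to the chain of database entries
    (intermediates first, the hit itself last) that links it to the target.
    """
    paths: Dict[str, List[str]] = {}
    queries: List[Tuple[Profile, List[str]]] = [(target, [])]

    for _ in range(max_rounds):
        next_queries: List[Tuple[Profile, List[str]]] = []
        for query, path in queries:
            for entry_id, profile in database.items():
                if entry_id in paths:
                    continue  # already reached in this or an earlier round
                if is_match(query, profile):
                    paths[entry_id] = path + [entry_id]
                    next_queries.append((profile, paths[entry_id]))
        if not next_queries:
            break  # no new matches, so the search has converged
        queries = next_queries
    return paths


def shares_two(a: Profile, b: Profile) -> bool:
    """Toy match rule (assumption): two profiles match if they share >= 2 items."""
    return len(a & b) >= 2


# Hypothetical toy data: "c" is only reachable from the target via "a" and "b".
toy_target: Profile = {1, 2, 3}
toy_db: Dict[str, Profile] = {
    "a": {2, 3, 4},
    "b": {3, 4, 5},
    "c": {4, 5, 6},
    "d": {10, 11, 12},
}

direct_hits = transitive_search(toy_target, toy_db, shares_two, max_rounds=1)
all_hits = transitive_search(toy_target, toy_db, shares_two, max_rounds=3)
print(direct_hits)
print(all_hits)
```

In this toy setting a single round finds only the entry that matches the target directly, while further rounds reach entries that are linked to the target only through intermediates. Capping the number of rounds is one simple way to bound the cost of the search, which otherwise grows with every new set of queries.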

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Center for Research Resources (NCRR)
Type: Biotechnology Resource Grants (P41)
Project #: 5P41RR006009-19
Application #: 7956240
Study Section: Special Emphasis Panel (ZRG1-BCMB-Q (40))

Project Start: 2009-08-01
Project End: 2010-07-31
Budget Start: 2009-08-01
Budget End: 2010-07-31
Support Year: 19
Fiscal Year: 2009
Total Cost: $771
Indirect Cost

Institution

Name: Carnegie-Mellon University
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 052184116

City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213

Publications

Simakov, Nikolay A; Kurnikova, Maria G (2018) Membrane Position Dependency of the pKa and Conductivity of the Protein Ion Channel. J Membr Biol 251:393-404

Yonkunas, Michael; Buddhadev, Maiti; Flores Canales, Jose C et al. (2017) Configurational Preference of the Glutamate Receptor Ligand Binding Domain Dimers. Biophys J 112:2291-2300

Hwang, Wonmuk; Lang, Matthew J; Karplus, Martin (2017) Kinesin motility is driven by subdomain dynamics. Elife 6:

Earley, Lauriel F; Powers, John M; Adachi, Kei et al. (2017) Adeno-associated Virus (AAV) Assembly-Activating Protein Is Not an Essential Requirement for Capsid Assembly of AAV Serotypes 4, 5, and 11. J Virol 91:

Subramanian, Sandeep; Chaparala, Srilakshmi; Avali, Viji et al. (2016) A pilot study on the prevalence of DNA palindromes in breast cancer genomes. BMC Med Genomics 9:73

Ramakrishnan, N; Tourdot, Richard W; Radhakrishnan, Ravi (2016) Thermodynamic free energy methods to investigate shape transitions in bilayer membranes. Int J Adv Eng Sci Appl Math 8:88-100

Zhang, Yimeng; Li, Xiong; Samonds, Jason M et al. (2016) Relating functional connectivity in V1 neural circuits and 3D natural scenes using Boltzmann machines. Vision Res 120:121-31

Lee, Wei-Chung Allen; Bonin, Vincent; Reed, Michael et al. (2016) Anatomy and function of an excitatory network in the visual cortex. Nature 532:370-4

Murty, Vishnu P; Calabro, Finnegan; Luna, Beatriz (2016) The role of experience in adolescent cognitive development: Integration of executive, memory, and mesolimbic systems. Neurosci Biobehav Rev 70:46-58

Shafee, Rebecca; Buckner, Randy L; Fischl, Bruce (2015) Gray matter myelination of 1555 human brains using partial volume corrected MRI images. Neuroimage 105:473-85

Showing the most recent 10 out of 292 publications