Substitution Matrices Into the NSP-Tree in Biological Sequence Databases

Qian, Gang

Abstract

This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator. A basic operation on biological sequence databases is to locate homologous regions for a given query sequence using pair-wise alignments. Unfortunately, the dynamic programming algorithm used for sequence alignments is computationally expensive, making it prohibitive for today's rapidly growing sequence databases. Existing alignment tools, such as FASTA and BLAST, though fast in locating candidate homologous regions, sacrifice sensitivity for efficiency: they may miss some true homologous regions in database sequences. In this project, we will develop novel indexing algorithms for large biological databases that support efficient pair-wise sequence alignments with high sensitivity. Specifically, we will incorporate widely used substitution matrices, such as PAM and BLOSUM, into the construction algorithms of the NSP-tree (an index structure designed for sequence data) so that sequences with evolutionarily related letters are grouped together in the structure of the NSP-tree. As a result, indexed sequence groups with unrelated letters will obtain a low score when aligned to a given query sequence and will be promptly pruned. By enhancing the pruning power of the NSP-tree, we expect that the new index-based approach will provide high sensitivity while maintaining a level of efficiency comparable to, or even higher than, that of existing pair-wise alignment tools.
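The pruning idea above can be illustrated with a minimal sketch. It is not the project's algorithm: it assumes a Smith-Waterman style local alignment with a linear gap penalty, represents an indexed group as one set of letters per position (a hypothetical simplification of an NSP-tree node), and uses a made-up `toy_score` function in place of real PAM or BLOSUM values. Scoring each group column by its best-matching letter gives an upper bound on the score of every sequence in the group, so a group whose bound falls below the reporting threshold can be skipped.

```python
from typing import Callable, List, Sequence, Set

GAP = -4  # illustrative linear gap penalty (an assumption, not from the source)


def toy_score(a: str, b: str) -> int:
    """Toy substitution scores; NOT real PAM/BLOSUM values."""
    if a == b:
        return 5
    related = {frozenset("IL"), frozenset("DE"), frozenset("KR")}
    return 2 if frozenset((a, b)) in related else -3


def as_columns(seq: str) -> List[Set[str]]:
    """View a single sequence as a group with one letter per column."""
    return [{c} for c in seq]


def group_columns(seqs: Sequence[str]) -> List[Set[str]]:
    """Summarize equal-length sequences as the set of letters seen per position."""
    return [set(col) for col in zip(*seqs)]


def local_score(query: str,
                columns: Sequence[Set[str]],
                score: Callable[[str, str], int] = toy_score,
                gap: int = GAP) -> int:
    """Best local alignment score of the query against a column summary.

    Each column is scored with its best-matching letter, so for a group
    summary the result is an optimistic bound for every member sequence.
    """
    prev = [0] * (len(columns) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, letters in enumerate(columns, start=1):
            sub = max(score(q, x) for x in letters)
            h = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            curr.append(h)
            best = max(best, h)
        prev = curr
    return best


def can_prune(query: str, seqs: Sequence[str], threshold: int) -> bool:
    """True when no sequence in the group can reach the threshold."""
    return local_score(query, group_columns(seqs)) < threshold


group = ["KRKR", "RKRK"]
print(can_prune("ILDE", group, threshold=10))  # group holds only unrelated letters
print(can_prune("KKRR", group, threshold=10))  # group holds related letters
```

The bound is tighter when the letters sharing a column are evolutionarily related, which is the motivation for grouping such sequences together during index construction.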
The project will be conducted in four steps: 1) Developing a new dynamic programming query algorithm to handle the alignments between a query sequence and sequence groups indexed in the tree; 2) Based on the substitution matrices, analyzing functionally conservative letters in biological sequences, and creating a clustering tree that hierarchically organizes the letters by their evolutionary closeness; 3) Designing new heuristics that incorporate the clustering tree of letters into the construction algorithms of the NSP-tree; and 4) Conducting experimental studies on the performance of the new heuristics and comparing the performance of the NSP-tree with that of the existing tools.
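Step 2 can be sketched as follows. The abstract does not say which clustering method will be used, so this example assumes plain agglomerative clustering with average linkage, treating a higher substitution score as greater closeness. The `toy_score` function and the six-letter alphabet are hypothetical stand-ins for a real PAM or BLOSUM matrix over the full amino acid alphabet.

```python
from itertools import combinations
from typing import Callable, Iterable


def toy_score(a: str, b: str) -> int:
    """Toy substitution scores; NOT real PAM/BLOSUM values."""
    if a == b:
        return 5
    related = {frozenset("IL"), frozenset("DE"), frozenset("KR")}
    return 2 if frozenset((a, b)) in related else -3


def cluster_letters(letters: Iterable[str],
                    score: Callable[[str, str], int]):
    """Build a binary clustering tree of letters as nested tuples.

    At every step the two clusters with the highest average pairwise
    substitution score are merged (average linkage).
    """
    # Each cluster is a pair: (tree so far, tuple of member letters).
    clusters = [(x, (x,)) for x in letters]

    def average_score(pair):
        (_, members1), (_, members2) = pair
        total = sum(score(a, b) for a in members1 for b in members2)
        return total / (len(members1) * len(members2))

    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2), key=average_score)
        clusters = [c for c in clusters if c is not c1 and c is not c2]
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
    return clusters[0][0]


tree = cluster_letters("ILDEKR", toy_score)
print(tree)  # related letters end up as sibling leaves
```

An index construction heuristic in the spirit of step 3 could then prefer to place sequences in the same node when their letters at a position fall under the same low-level subtree.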

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Center for Research Resources (NCRR)
Type: Exploratory Grants (P20)
Project #: 5P20RR016478-10
Application #: 8167540
Study Section: Special Emphasis Panel (ZRR1-RI-4 (01))

Project Start: 2010-04-01
Project End: 2011-03-31
Budget Start: 2010-04-01
Budget End: 2011-03-31
Support Year: 10
Fiscal Year: 2010
Total Cost: $29,674

Institution

Name: University of Oklahoma Health Sciences Center
Department: Microbiology/Immun/Virology
Type: Schools of Medicine
DUNS #: 878648294

City: Oklahoma City
State: OK
Country: United States
Zip Code: 73117
