Data Structures and Sequence Algorithms

Neuwald, A

Abstract

The development of computational tools in molecular biology has often been hindered by the lack of predefined algorithms and data structures corresponding to operations on mathematical objects that represent molecular sequences. Consequently, developing a sequence analysis method often requires first defining (either explicitly or implicitly) and implementing such objects and operations. This results in duplicated effort and, in some cases, in poorly designed algorithms. This project seeks to address this problem. Mathematical objects representing various attributes of molecular sequences, and commonly used operations on these objects, were defined. These include: an alphabet (for proteins or nucleic acids), a sequence, a set of sequences, a sequence segment, an alignment of segments, a pattern (that is, a regular expression), a motif (that is, a model representing the frequency with which specific residues or bases occur at various positions in a local multiple alignment), and several types of scoring matrices. These objects and operations were implemented in the C programming language and have facilitated the development of a variety of new methods, including a depth-first pattern searching algorithm and several Gibbs sampling methods for motif detection, among others. Reusing these objects often saves substantial development time. The code for these structures is being made available to the biological community over the network.
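
To make the objects listed in the abstract more concrete, the following is a minimal sketch in C of how an alphabet, a sequence, and a motif (per-position residue counts over aligned segments) might be represented. It is not the project's actual code: every type name, function name, and size limit here (Alphabet, Sequence, Motif, alphabet_init, sequence_init, motif_init, motif_add_segment, motif_frequency, MOTIF_MAX_WIDTH, MOTIF_MAX_ALPHA) is a hypothetical chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical limits, chosen only to keep the sketch allocation-free. */
#define MOTIF_MAX_WIDTH 32
#define MOTIF_MAX_ALPHA 32

/* Alphabet: maps each residue letter to a small integer code. */
typedef struct {
    const char *letters;    /* e.g. "ACGT"; index in the string is the code */
    int size;               /* number of letters */
    signed char code[256];  /* letter -> code, or -1 if not in the alphabet */
} Alphabet;

static void alphabet_init(Alphabet *a, const char *letters)
{
    assert(a != NULL && letters != NULL);
    a->letters = letters;
    a->size = (int)strlen(letters);
    memset(a->code, -1, sizeof a->code);  /* mark every byte value as unknown */
    for (int i = 0; i < a->size; i++)
        a->code[(unsigned char)letters[i]] = (signed char)i;
}

/* Sequence: residues stored as alphabet codes rather than characters. */
typedef struct {
    int length;
    signed char *residue;   /* heap-allocated; release with free() */
} Sequence;

/* Encode text over alphabet a. Returns 0 on success, or -1 if a letter is
   not in the alphabet or memory cannot be allocated. */
static int sequence_init(Sequence *s, const Alphabet *a, const char *text)
{
    size_t n = strlen(text);
    s->length = 0;
    s->residue = malloc(n ? n : 1);
    if (s->residue == NULL)
        return -1;
    for (size_t i = 0; i < n; i++) {
        signed char c = a->code[(unsigned char)text[i]];
        if (c < 0) {
            free(s->residue);
            s->residue = NULL;
            return -1;
        }
        s->residue[i] = c;
    }
    s->length = (int)n;
    return 0;
}

/* Motif: how often each residue occurs at each position of the aligned
   segments added so far. */
typedef struct {
    int width;       /* number of positions in the motif */
    int alpha_size;  /* number of residue codes */
    int nsegments;   /* number of segments counted */
    int count[MOTIF_MAX_WIDTH][MOTIF_MAX_ALPHA];
} Motif;

static void motif_init(Motif *m, const Alphabet *a, int width)
{
    assert(width > 0 && width <= MOTIF_MAX_WIDTH);
    assert(a->size <= MOTIF_MAX_ALPHA);
    memset(m, 0, sizeof *m);
    m->width = width;
    m->alpha_size = a->size;
}

/* Add the segment of s beginning at start (0-based) to the motif.
   Returns 0 on success, or -1 if the segment runs off the sequence. */
static int motif_add_segment(Motif *m, const Sequence *s, int start)
{
    if (start < 0 || start + m->width > s->length)
        return -1;
    for (int j = 0; j < m->width; j++)
        m->count[j][s->residue[start + j]]++;
    m->nsegments++;
    return 0;
}

/* Observed frequency of residue code at motif position pos. */
static double motif_frequency(const Motif *m, int pos, int code)
{
    assert(m->nsegments > 0);
    assert(pos >= 0 && pos < m->width && code >= 0 && code < m->alpha_size);
    return (double)m->count[pos][code] / m->nsegments;
}
```

Storing residues as integer codes lets the motif index its count table directly, which is one plausible reason to define the alphabet as its own object. A Gibbs sampler or pattern search built on such types would only need to call motif_add_segment and motif_frequency, without re-implementing the encoding.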

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Intramural Research (Z01)
Project #: 1Z01LM000058-01
Application #: 3759326
Support Year: 1
Fiscal Year: 1994

Institution

Name: National Library of Medicine
Country: United States

Related projects


NIH 1997 Z01 LM	Gibbs Sampling Propagation Algorithm for Multiple Sequence Alignment Lawrence, C E. / National Library of Medicine
NIH 1996 Z01 LM	Gibbs Sampling Propagation Algorithm for Multiple Sequence Alignment Lawrence, C E. / National Library of Medicine
NIH 1995 Z01 LM	Gibbs Sampling Propagation Algorithm for Multiple Sequence Alignment Lawrence, C E. / National Library of Medicine
NIH 1994 Z01 LM	Data Structures and Sequence Algorithms Neuwald, A F. / National Library of Medicine
