Pattern Discovery in Combinatorial Databases: Algorithms, Applications, and Software for the Scientific Community

Wang, Jason

Abstract

This is an interinstitutional collaborative project. Combinatorial data consisting of sequences, trees, and graphs arise in many scientific disciplines. For example, the primary structure of proteins is a sequence, whereas the tertiary structure is a graph. Comparing such data to find similarities entails the use of a "distance metric" that measures the difference between two data items. Numerous distance metrics are possible. This work consists primarily of (i) inventing efficient ways to compute known distance metrics; (ii) developing a data structure to decide which of a set of data items is "closest" (according to a given distance metric) to a new data item; (iii) developing techniques and software for discovering patterns with minimum or near-minimum distance to a given set of data items with respect to a given distance metric; and (iv) developing software to solve such discovery problems on networks of occasionally idle workstations. This work will help every field in which approximate matching is important. Significant applications are expected in molecular biology and rational drug design, as well as in finding patterns in linguistic strings.
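To make the notion of a distance metric on sequences concrete, here is a minimal sketch that uses the classic Levenshtein edit distance as one of the many possible metrics. It is an illustration only, not the project's own algorithms or data structures: the function names `edit_distance` and `closest` are hypothetical, and the "closest item" lookup is a plain linear scan rather than the specialized data structure described in item (ii).

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb, or match
            ))
        prev = curr
    return prev[-1]


def closest(items, query):
    """Return the stored item nearest to query (illustrative linear scan)."""
    return min(items, key=lambda s: edit_distance(s, query))
```

For example, `closest(["ACGT", "TTTT", "ACGA"], "ACGTT")` compares the query against each stored sequence and returns the one with the smallest edit distance. A linear scan costs one distance computation per stored item, which is the cost an index structure for closest-item queries would aim to reduce.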

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9531548
Program Officer: Maria Zemankova

Project Start
Project End
Budget Start: 1996-08-01
Budget End: 2000-01-31
Support Year
Fiscal Year: 1995
Total Cost: $207,532
Indirect Cost

Institution

Rutgers University, Newark, NJ, United States
