9224602 Shasha SDB: Discovering Motifs in Scientific Databases

This is the first year of funding of a three-year continuing award. This research is carried out in collaboration with Bruce Shapiro of the National Institutes of Health.

Scientific progress often results from discovering structural commonalities that explain similar behavior. For example, in molecular biology, a set of proteins or DNA sequences may express similar functionality in nature. This project aims to help scientists discover common sequence or topological patterns that explain the similarity. Pattern discovery entails generating pattern guesses in a systematic way and testing them. The tests are based on approximate pattern matching algorithms that yield distance metrics, so the commonalities found may be approximate rather than exact.

The main research milestones are a family of algorithms for pattern discovery, query processing, data organization and index manipulation. The algorithms are to be tested on data drawn from the National Institutes of Health and from public genome databases. Whereas some of the algorithms are specific to the combinatorial structures present in biology, many of the techniques should generalize to any application that seeks to find patterns in databases.

This project will help scientists discover patterns in large databases that determine natural behavior. Such patterns may lead to new drug design or to new treatments.
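The generate-and-test approach to pattern discovery described in the abstract can be illustrated with a small sketch. It is not the project's algorithm: it assumes plain string sequences, uses every fixed-length substring as a pattern guess, and uses edit distance against the best-matching substring as the distance metric. The function names (`best_match_distance`, `candidate_patterns`, `discover_motifs`) and the parameters `length`, `max_distance` and `min_support` are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Set


def best_match_distance(pattern: str, text: str) -> int:
    """Smallest edit distance between `pattern` and any substring of `text`."""
    # prev[j] holds the best distance between the pattern prefix processed so
    # far and some substring of `text` ending at position j. The empty prefix
    # matches anywhere at cost 0, which is what makes this substring matching.
    prev = [0] * (len(text) + 1)
    for i, p in enumerate(pattern, start=1):
        curr = [i] + [0] * len(text)
        for j, t in enumerate(text, start=1):
            curr[j] = min(
                prev[j - 1] + (p != t),  # match or substitution
                prev[j] + 1,             # pattern character left unmatched
                curr[j - 1] + 1,         # extra character in the text
            )
        prev = curr
    return min(prev)


def candidate_patterns(sequences: List[str], length: int) -> Set[str]:
    """Generate pattern guesses: every substring of the given length."""
    return {
        s[i:i + length]
        for s in sequences
        for i in range(len(s) - length + 1)
    }


def discover_motifs(
    sequences: List[str], length: int, max_distance: int, min_support: int
) -> Dict[str, int]:
    """Keep guesses that approximately occur in at least `min_support` sequences."""
    motifs: Dict[str, int] = {}
    for pattern in sorted(candidate_patterns(sequences, length)):
        support = sum(
            best_match_distance(pattern, s) <= max_distance for s in sequences
        )
        if support >= min_support:
            motifs[pattern] = support
    return motifs


if __name__ == "__main__":
    toy_sequences = ["GATTACA", "CATTAGA", "TTTTACC"]  # made-up toy data
    print(discover_motifs(toy_sequences, length=4, max_distance=1, min_support=3))
```

Exhaustive enumeration like this is only workable for short patterns and small inputs. The data organization and indexing work named among the milestones is presumably what would make such testing practical on large databases.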

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9224601
Budget Start: 1993-08-01
Budget End: 1997-01-31
Fiscal Year: 1992
Total Cost: $194,352
Name: New York University
City: New York
State: NY
Country: United States
Zip Code: 10012