Many government and commercial organizations possess extremely large databases, with sizes often measured in terabytes, containing information such as consumer data, astronomical observations, and biological sequences. The extraction of information from such large databases has become known as database mining, an area where machine learning techniques must meet the performance requirements of very large database systems. This research focuses on one particular database-mining task: rule discovery. Rule discovery is viewed as an interactive, iterative process with a human in the loop, in which the user is trying to discover not only interesting results but also interesting questions to ask. The approach is based on the key idea that rules can themselves be viewed as objects. Under this view, the space of possible rules supported by a database can itself be treated as a database, and rule discovery can be approached as a process of querying the rule base implicitly defined by each database. The user of the discovery system interacts with it via ad hoc rulebase queries, designing the desired query interactively as results are returned during a rule-discovery session. The proposed implementation of the data mining system is tested on health-care data obtained through an ongoing collaboration with a major provider of managed health care.
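
The idea of treating rules as queryable objects can be illustrated with a minimal sketch. It is not the project's implementation: the abstract does not specify the rule form or the query language, so the sketch assumes simple one-item association rules scored by support and confidence, and every name in it (`RECORDS`, `Rule`, `rulebase`, `query`) and every data value is hypothetical.

```python
from dataclasses import dataclass
from itertools import permutations

# Hypothetical toy records; each record is a set of attribute=value items.
RECORDS = [
    {"age=senior", "plan=hmo", "visits=high"},
    {"age=senior", "plan=hmo", "visits=high"},
    {"age=adult", "plan=hmo", "visits=low"},
    {"age=adult", "plan=ppo", "visits=low"},
]


@dataclass(frozen=True)
class Rule:
    """A rule 'antecedent -> consequent' treated as an ordinary data object."""
    antecedent: str
    consequent: str
    support: float     # fraction of all records containing both items
    confidence: float  # fraction of antecedent records that also contain the consequent


def rulebase(records):
    """Lazily enumerate the one-item rules implicitly defined by the records.

    Assumption: only rules matched by at least one record are part of the rulebase.
    """
    items = sorted(set().union(*records))
    total = len(records)
    for antecedent, consequent in permutations(items, 2):
        n_antecedent = sum(1 for r in records if antecedent in r)
        n_both = sum(1 for r in records if antecedent in r and consequent in r)
        if n_both:
            yield Rule(antecedent, consequent, n_both / total, n_both / n_antecedent)


def query(records, predicate):
    """Answer an ad hoc rulebase query: keep the rules that satisfy the predicate."""
    return [rule for rule in rulebase(records) if predicate(rule)]


# First query of a session: which rules predict high visit counts with high confidence?
strong = query(RECORDS, lambda r: r.consequent == "visits=high" and r.confidence >= 0.9)

# Refined query after inspecting the first result: relax the confidence threshold.
relaxed = query(RECORDS, lambda r: r.consequent == "visits=high" and r.confidence >= 0.6)

for rule in relaxed:
    print(rule)
```

The two successive calls to `query` stand in for the interactive loop described above: the user inspects one result set and then reformulates the question. Because `rulebase` is a generator, the rule space is never materialized in full, which loosely mirrors the notion of a rule base that is only implicitly defined by the underlying database.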

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9509819
Program Officer: Maria Zemankova
Project Start:
Project End:
Budget Start: 1995-09-01
Budget End: 1999-02-28
Support Year:
Fiscal Year: 1995
Total Cost: $384,036
Indirect Cost:
Name: Rutgers University
Department:
Type:
DUNS #:
City: New Brunswick
State: NJ
Country: United States
Zip Code: 08901