CAREER: Probabilistic Knowledge Discovery and Data Mining: An Integrated Approach at the Interface of Computer Science and Statistics

Smyth, Padhraic

Abstract

This project integrates ideas from computer science, mathematics, and statistics and applies them to knowledge discovery and data mining of very large data sets. The project has two general research goals. The first is the development of novel methods for exploring and identifying structure in multivariate data, with particular emphasis on clustering and density estimation. The second is the development of novel methods for modeling sequential structure in data, in particular the use of graphical models to facilitate model construction and estimation. The technical approach couples ideas from modern statistical modeling with computational techniques. A key feature of this work is the use of large-scale scientific and engineering data sets as testbeds, including upper-atmosphere spatio-temporal data records, a large medical data set consisting of heterogeneous data types for the study of Alzheimer's disease, planetary image data sets with associated annotations and catalogs of geologic features, and multivariate engineering sensor data from online system monitoring. The educational component of the project consists of the development of new courses that emphasize a first-principles understanding of model exploration in the context of data analysis, as well as opportunities for students to participate in interdisciplinary, large-scale exploratory data mining projects. This project can have a significant impact on how large data sets are explored and analyzed across a wide variety of scientific, medical, and business disciplines.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9703120
Program Officer: Maria Zemankova

Budget Start: 1997-09-01
Budget End: 2003-08-31
Fiscal Year: 1997
Total Cost: $304,379

University of California Irvine, Irvine, CA, United States