The rapid growth in both the volume and complexity of public-domain and scientific data presents new and exciting challenges. This project aims to develop a theoretical framework for structured learning of distribution spaces and to study tools for identifying and exploiting probabilistic structure in high-dimensional, large-volume data. The project lies at the intersection of several disciplines (signal processing, pattern recognition, machine learning, and probability and statistics) and will thus foster collaboration among them. Applying the proposed framework to data-driven medical diagnosis and ecological research will extend the project's impact beyond computational data analysis. Additionally, this research aims to enrich the quality of education for both undergraduate and graduate students by integrating research, application, and new curriculum.

The research framework consists of geometrically-constrained probabilistic modeling and efficient optimization approaches for inference on multiple-instance data. The project sets forth the following tasks: (i) confidence-constrained joint estimation of multiple discrete probability models, (ii) joint learning of multiple distribution-based geometrically-constrained maximum-entropy models, and (iii) direct application of the developed framework to the analysis of clinical flow cytometry data for medical diagnosis and in-situ bioacoustics data for ecological research.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 1254218
Program Officer: Phillip Regalia
Project Start:
Project End:
Budget Start: 2013-04-01
Budget End: 2019-03-31
Support Year:
Fiscal Year: 2012
Total Cost: $468,077
Indirect Cost:
Name: Oregon State University
Department:
Type:
DUNS #:
City: Corvallis
State: OR
Country: United States
Zip Code: 97331