This grant provides funding for the development of novel optimization tools for data mining. The work will focus on incorporating inputs from disparate sources with different levels of reliability. These inputs will be allowed to affect the patterns to a degree that depends on the confidence level attributed to them. The optimization algorithms developed will also rely on pairwise comparisons, or separation measures, rather than on parametric mapping of the attributes alone. The data mining outcome, in the form of a classification, will be an optimal solution to a penalty minimization objective. The penalty is higher for deviating from opinions and pairwise comparisons that are more reliable, and lower for deviating from those that come from less reliable sources. This family of techniques will be tested for effectiveness against existing methodologies in the areas of patient prognosis, customer segmentation, and country or firm credit assessment. The testing will result in calibration and fine-tuning of the penalty functions appropriate for use in different contexts.
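
The penalty minimization objective described above can be illustrated with a minimal sketch. It is one possible reading, not the project's actual formulation: the quadratic penalty form, the use of a numeric weight as the confidence level, the coordinate-descent solver, the 0.5 classification threshold, and the names `fit_scores` and `classify` are all assumptions made for illustration.

```python
def fit_scores(n, opinions, comparisons, sweeps=200):
    """Assign a score to each of n items by minimizing an assumed penalty:

        sum over opinions (i, a, w):        w * (x[i] - a) ** 2
      + sum over comparisons (i, j, d, w):  w * (x[i] - x[j] - d) ** 2

    where w is the confidence in the source (hypothetical quadratic form).
    Solved by exact coordinate descent; assumes i != j in every comparison.
    """
    x = [0.0] * n
    for _ in range(sweeps):
        for k in range(n):
            num = 0.0  # weighted sum of the values that item k is pulled toward
            den = 0.0  # total confidence weight acting on item k
            for i, a, w in opinions:
                if i == k:
                    num += w * a
                    den += w
            for i, j, d, w in comparisons:
                if i == k:
                    num += w * (x[j] + d)
                    den += w
                elif j == k:
                    num += w * (x[i] - d)
                    den += w
            if den > 0:
                x[k] = num / den  # minimizer of the penalty in coordinate k
    return x


def classify(scores, threshold=0.5):
    """Turn scores into two classes; the threshold is an illustrative choice."""
    return [1 if s >= threshold else 0 for s in scores]


# Item 0: a reliable source (weight 9) says 1.0, an unreliable one (weight 1) says 0.0.
# Item 1: no direct opinion, only a pairwise comparison saying item 0 exceeds it by 1.0.
opinions = [(0, 1.0, 9.0), (0, 0.0, 1.0)]
comparisons = [(0, 1, 1.0, 5.0)]
scores = fit_scores(2, opinions, comparisons)
labels = classify(scores)
```

In this sketch the score of item 0 settles close to the reliable source's opinion because deviating from it costs nine times more than deviating from the unreliable one, while item 1 is placed purely by its separation from item 0.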

If successful, the data mining techniques are expected to have an impact on pattern recognition and on methodologies for capturing expert knowledge. They will make it possible to incorporate expert assessments alongside empirical data and scientific theory predictions, each contributing to the final pattern outcome according to the confidence in the input from each source. Potential applications of the research include financial engineering and health care.

Budget Start: 2006-08-15
Budget End: 2012-07-31
Fiscal Year: 2006
Total Cost: $345,453
Name: University of California Berkeley
City: Berkeley
State: CA
Country: United States
Zip Code: 94704