Massive datasets arise in settings ranging from highly specialized scientific research to the ubiquitous internet. Making sense of such data requires sophisticated computer science techniques such as data classification, approximation, and optimization. All of these techniques can be improved substantially by making effective use of prior knowledge, which is often readily available. For example, doctors' experience can be used to obtain improved classifiers for important problems such as medical diagnosis and prognosis. Since the most powerful state-of-the-art classifiers are based on support vector machines, which in turn are formulated as constrained or unconstrained optimization problems, the aim is to incorporate prior knowledge into various optimization-based applications, such as classification and approximation problems, as well as into the theory of optimization itself. To a large degree, this proposal is motivated by the investigators' extensive collaborative work with oncologists, surgeons, and medical physicists, and by their desire to make full use of such practitioners' expertise by incorporating it into computable yet rigorous models.

The intellectual merit of the proposed work lies in the use of rigorous theory and problem analysis techniques that incorporate domain-specific information into general optimization problems. First, the research will incorporate prior knowledge into a linear or nonlinear support vector machine classifier and show that such incorporation is possible by appending additional constraints to the original problem. Preliminary tests indicate improvements in classifier accuracy. Second, prior knowledge will be introduced into approximation problems: in addition to the given discrete data normally used to generate an approximation to an unknown function, prior knowledge is also taken into account. Finally, prior knowledge will be incorporated into general constrained or unconstrained optimization problems, where it takes the form of new constraints imposed on the behavior of the objective function over various regions. The generality of these new techniques will facilitate the integration of information from disparate sources, since the theory allows multiple sets of prior information to be included concurrently. Specific application to radiotherapy treatment planning problems will ensure that the computer science advances are demonstrably useful in a particular problem domain.
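
As an illustration of how prior knowledge can enter a linear support vector machine as additional constraints, the following is a minimal sketch and not necessarily the formulation the investigators use. All symbols are assumptions introduced here: A is the matrix whose rows are the training points, D is a diagonal matrix of +1/-1 class labels, e is a vector of ones, y is a vector of slack variables, nu > 0 is a trade-off parameter, and the prior knowledge is taken to be a nonempty polyhedral region {x : Bx <= b} that an expert asserts lies entirely in the +1 class.

```latex
% Baseline linear SVM in 1-norm (linear programming) form; notation is illustrative.
\begin{aligned}
\min_{w,\gamma,y}\quad & \nu\, e^{\top} y + \|w\|_1 \\
\text{s.t.}\quad & D(Aw - e\gamma) + y \ge e, \qquad y \ge 0.
\end{aligned}

% Prior knowledge: every x in the region {x : Bx <= b} is classified as +1.
\{x : Bx \le b\} \subseteq \{x : x^{\top} w \ge \gamma + 1\}

% For a nonempty region, linear programming duality shows the inclusion holds
% exactly when some multiplier vector u satisfies constraints linear in (w, gamma, u):
B^{\top} u + w = 0, \qquad b^{\top} u + \gamma + 1 \le 0, \qquad u \ge 0.
```

Because the last line is linear in w, gamma, and u, appending it to the baseline problem keeps the problem a linear program, and several knowledge regions can be handled by adding one such group of constraints, with its own multiplier vector, per region.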

The optimization, modeling, and computational techniques will provide a boost to advances in cancer diagnosis and prognosis, chemotherapy, and other treatment regimes. The knowledge-based approach encompasses a broad spectrum of important classification and approximation problems that have wide applicability in science and engineering. The work will also raise the profile of data mining techniques in other areas such as surgery, pharmacology, and medical research, by demonstrating how the methodologies can be used to incorporate prior knowledge into both planning and design issues, and by improving both the efficiency of delivery and the effectiveness of treatment in many clinical settings. By coupling the education of several computer science and engineering students with the proposed work, the project will train a new group of multidisciplinary researchers who will ensure that the technical advances are applied to further application domains.

Budget Start: 2005-09-01
Budget End: 2009-08-31
Fiscal Year: 2005
Total Cost: $496,040
Name: University of Wisconsin Madison
City: Madison
State: WI
Country: United States
Zip Code: 53715