The next generation of machine learning systems will need to be substantially more flexible than current systems. They will need to grow new structure as needed, to account for repeated substructures that arise from relational knowledge, to handle abstraction hierarchies, and to cope more gracefully with heterogeneous data. This project addresses these issues. It targets problems both in the Bayesian approach to machine learning (specifically, graphical models) and in the frequentist approach (specifically, kernel machines). In the graphical model setting, the PI describes a new approach to structure learning based on a flexible prior known as the "Chinese restaurant process" (CRP). The project explores a generalization of the CRP, referred to as the "hierarchical Dirichlet process," that makes it possible to take into account repeated or partially repeated substructures. It also explores a generalization of the CRP, referred to as the "nested Chinese restaurant process," for learning abstraction hierarchies.
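To illustrate how a CRP prior lets a model "grow new structure as needed," the following minimal sketch draws one seating arrangement from a standard CRP. In the usual formulation, a new customer joins existing table k with probability proportional to the number of customers already seated there, and opens a new table with probability proportional to a concentration parameter. The function name `sample_crp` and the parameter names `alpha` and `seed` are illustrative assumptions, not taken from the project; this is a sketch of the basic prior only, not of the hierarchical or nested generalizations.

```python
import random


def sample_crp(num_customers, alpha, seed=None):
    """Draw one seating arrangement from a Chinese restaurant process.

    Illustrative sketch: `alpha` (> 0) is the concentration parameter;
    larger values make new tables (new clusters) more likely.
    """
    rng = random.Random(seed)
    table_sizes = []   # table_sizes[k] = number of customers at table k
    assignments = []   # assignments[i] = table index chosen by customer i

    for _ in range(num_customers):
        # Existing table k has weight table_sizes[k];
        # the option "open a new table" has weight alpha.
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]

        if table == len(table_sizes):
            table_sizes.append(1)      # a new table is created
        else:
            table_sizes[table] += 1    # join an existing table
        assignments.append(table)

    return assignments, table_sizes


if __name__ == "__main__":
    seats, sizes = sample_crp(num_customers=20, alpha=1.0, seed=0)
    print("assignments:", seats)
    print("table sizes:", sizes)
```

The number of tables is not fixed in advance: it is determined by the data size and by `alpha`, which is the sense in which the prior is flexible.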
In the area of kernel machines, the PI builds on his previous NSF-sponsored work to consider methods for combining heterogeneous kernels based on tools from convex optimization, in particular semidefinite programming. He will use these ideas to define novel feature selection methods, and to design new algorithms for the semidefinite programming approach that are the analog of the "sequential minimal optimization" (SMO) algorithm for quadratic programming, which has permitted the rise to prominence of the support vector machine. The project will focus on driving applications in the areas of information retrieval, bioinformatics, bug-finding in computer programs, and sensor networks.
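As a rough illustration of what combining heterogeneous kernels can mean, the sketch below forms a nonnegative weighted sum of two kernel matrices computed from two different views of the same three items. A nonnegative combination of valid kernel matrices is itself a valid kernel matrix. Everything here is an assumption made for illustration: the helper names (`linear_kernel`, `rbf_kernel`, `combine_kernels`), the toy data, and especially the weights, which are fixed by hand. In the approach described above the weights would instead be chosen by convex optimization (semidefinite programming), which this sketch does not implement.

```python
import math


def linear_kernel(xs):
    """Gram matrix of inner products between feature vectors."""
    return [[sum(a * b for a, b in zip(x, y)) for y in xs] for x in xs]


def rbf_kernel(xs, gamma=1.0):
    """Gram matrix of a Gaussian (RBF) kernel with width parameter gamma."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y))) for y in xs]
        for x in xs
    ]


def combine_kernels(kernels, weights):
    """Return sum_m weights[m] * kernels[m] for nonnegative weights."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel matrix")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative to preserve validity")
    n = len(kernels[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Two hypothetical views of the same three items.
    view_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    view_b = [[0.1], [0.2], [0.9]]

    # Hand-picked weights; a learned combination would replace these.
    K = combine_kernels([linear_kernel(view_a), rbf_kernel(view_b)], [0.5, 0.5])
    for row in K:
        print([round(v, 4) for v in row])
```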