With advances in computing technology, scientific and engineering investigations are becoming increasingly complex, and powerful statistical techniques are needed to exploit data with complex structure. The proposed project centers on the design and analysis of inferential and prediction methods for problems involving complex statistical modeling, such as those arising in machine learning and data mining. The problems to be investigated include inference about a modeling process that accounts for the uncertainty inherent in that process, as well as prediction and inference in the contexts of multi-class margin classification and semi-supervised learning. The specific aims of the project, motivated by characteristics of complex modeling processes in the targeted applications, are (1) the development of a novel theory of inference, together with inferential procedures and computational tools, for comparing complex modeling processes; (2) the development of multi-class margin classification techniques; (3) the development of novel techniques for semi-supervised learning; and (4) the development of techniques tailored to the targeted applications, including object tracking, cancer genomics classification, and text categorization. For the targeted applications, the PI will collaborate with other scientists and engineers.
The proposed educational program will train graduate students for research in statistics. Success of this project will bring substantial benefits to fundamental scientific and engineering research, particularly in automatic machine processing that combines human intelligence with machine speed, in mining data with complex structure, and in biomedical research.