Structured modeling problems with multivariate data involving tensors are treated using constrained likelihood approaches, which offer an efficient means of exploiting lower-dimensional structure for high-order analysis. In gene network analysis, for example, a constrained approach helps reveal gene-gene relations in the context of multiple graphical models. Special attention is devoted to choosing constraints that adapt to a variety of structures. The general theme of the proposed project is the development of statistical methods of practical utility in both prediction and estimation. In particular, the project develops methods for (a) multiple graphical models for structure extraction and (b) high-order analysis of tensor data. The research is primarily motivated by challenging problems arising in gene network analysis and collaborative filtering, where a central issue is how to use lower-dimensional structure to combat high statistical uncertainty in a discovery process. New techniques targeting biomedical and engineering problems are proposed and investigated, both computationally and statistically. In (a) and (b), the effort will focus on classification and regression, and on structure adaptation through tensor decomposition and factorization, with most of the effort directed toward condition-specific extraction of lower-dimensional structure.

Modern scientific and engineering investigation, as in biomedical research and computer vision, now produces enormous data sets for simultaneously exploring relations among hundreds or thousands of interacting units. This project proposes methods suited to this new scientific environment. The project develops technology that is directly applicable to applied research, particularly in automatic machine processing and data mining, biomedical research, advertising, and economics. Plans for technology transfer are described, along with an educational program that will train students in statistical learning and data mining. Educational activities include developing a course and attracting undergraduate students to research.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1207771
Program Officer: Gabor Szekely
Project Start:
Project End:
Budget Start: 2012-07-01
Budget End: 2016-06-30
Support Year:
Fiscal Year: 2012
Total Cost: $200,165
Indirect Cost:
Name: University of Minnesota Twin Cities
Department:
Type:
DUNS #:
City: Minneapolis
State: MN
Country: United States
Zip Code: 55455