This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).

Many real-world applications need to compress, sort, and otherwise manipulate large volumes of multidimensional data arrays (i.e., higher-order tensors), so there is an increasing need for theoretical and computational tools to deal with multiway data. The subject of multilinear algebra and tensors has gained prominence in the past decade due to a surge in applications such as approximation of Newton potentials and stochastic PDEs, image compression and deblurring, network traffic analysis, biological assay interpretation, unmixing of signals, and many more. Such applications involve factorizations of higher-order tensors. There are multiple possible extensions of matrix factorizations to higher-order tensors (e.g., extensions of the matrix SVD), with some more amenable to certain applications than others. The investigators are advancing the state of the art in both theoretical and computational multilinear algebra via several newly developed tensor constructions based on new notions of tensor multiplication, orthogonality, and diagonalizability. Algorithms with compression schemes based on these new constructions are being implemented by the investigators and tested on several datasets from various applications, including handwritten digit identification, genomics, the spectral unmixing problem, video compression, and computer image recognition.

Current applications in the sciences can involve analysis, classification, searching, and compression of large volumes of data that are "multidimensional" in nature. Consider, for example, the problem of facial recognition, used to identify a terrorist from within a database of images of known terrorists. The database can be considered multidimensional data in the sense that each individual's images vary by viewpoint, illumination, and facial expression. It is critical in this scenario to have an accurate and fast algorithm to match an unknown image against the database. Another example where a multidimensional representation is useful is genomic data. Here, two-dimensional tabular DNA microarray data from different experiments are concatenated into a multidimensional array. Recently published results have indicated that so-called "factorizations" of these multidimensional data can be used to discover new molecular-level interactions. Hence, advances in computer architecture for storing large datasets must be accompanied by mathematically sound models for the compression and/or analysis of such data. Development of new concepts and ideas is therefore required to deal with the different geometries that arise in the multidimensional case. The investigators are contributing directly to this effort by developing innovative mathematical theory for multidimensional objects that is consistent with proven two-dimensional ideas. With theoretical constructs in place, the investigators are able to create computational tools and algorithms to analyze and compress multidimensional data. A significant component of the proposal is the involvement of undergraduates, graduate students, and researchers. The investigators are leveraging the strengths of both universities in a novel, inter-institutional, vertically integrated experience for all students (undergraduate and graduate) involved in the research. This arrangement allows students at all levels, mentored by leading researchers in the field, to advance the state of the art in the analysis and compression of multidimensional data.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 0914957
Program Officer: Junping Wang
Budget Start: 2009-08-01
Budget End: 2012-07-31
Fiscal Year: 2009
Total Cost: $221,216

Name: Tufts University
City: Medford
State: MA
Country: United States
Zip Code: 02155