Mathematical models with nonnegative data values abound in science and engineering. For the sake of physical feasibility and interpretability, nonnegativity must be preserved in computation and analysis. This work concerns the factorization of a nonnegative matrix into a product of lower-rank nonnegative matrices. Nonnegative matrix factorization plays a major role in a wide range of important applications, including text mining, cheminformatics, factor retrieval, image articulation, bioinformatics, and dimension reduction and clustering in pattern and data analysis. The discoveries from the proposed research are expected not only to advance the theoretical foundations of matrix computation but also to contribute to general areas of data mining such as dimension reduction, clustering, and visualization.

The basic question behind nonnegative matrix factorization (NMF) is how best to approximate a given nonnegative data matrix as the product of two lower-dimensional and, hence, lower-rank nonnegative matrices. The two lower-rank matrices provide essential information that would otherwise be difficult to retrieve from the original matrix. Many NMF techniques have been proposed in the literature, yet there is still little theory on how the NMF can be solved robustly and efficiently. In this work, new, faster algorithms will be developed through structured and comprehensive performance evaluation of promising research directions, including active-set and geometry-based algorithms, on real-world application data. The proposed study of the geometric structure of the NMF and of the theoretical properties of NMF algorithms, such as convergence, should provide a basis for assessing any NMF method. The applicability of the NMF to dimension reduction and clustering will also be investigated.
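To make the approximation problem above concrete, the following is a minimal sketch in Python with NumPy. It uses the well-known multiplicative update rule for the Frobenius-norm objective as a simple baseline; it is not one of the active-set or geometry-based algorithms this project proposes to study. The function name `nmf_multiplicative`, its parameters, and the synthetic test matrix are assumptions introduced purely for illustration.

```python
import numpy as np

def nmf_multiplicative(A, k, iters=200, eps=1e-10, seed=0):
    """Approximate a nonnegative m x n matrix A by W @ H,
    with W (m x k) and H (k x n) both nonnegative.

    Baseline multiplicative updates for min ||A - W H||_F;
    illustrative only, not the method proposed in this project.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Strictly positive random start keeps every update well defined.
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(iters):
        # Each update multiplies by a nonnegative ratio,
        # so W and H can never become negative.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical data: a 6 x 5 nonnegative matrix of exact rank 2.
rng = np.random.default_rng(1)
A = rng.random((6, 2)) @ rng.random((2, 5))

W, H = nmf_multiplicative(A, k=2)
rel_err = np.linalg.norm(A - W @ H) / np.linalg.norm(A)
print("relative approximation error:", rel_err)
```

In this sketch the columns of `W` play the role of k nonnegative basis vectors and each column of `H` holds the nonnegative weights that combine them, which is what makes the factors usable for dimension reduction and clustering. Multiplicative updates are easy to state but can converge slowly, which is one motivation for studying faster alternatives.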

Results of this research are also likely to find applications in database management, medical examination and diagnosis, biochemical selection, and biological networks.

Project Start:
Project End:
Budget Start: 2007-10-01
Budget End: 2013-09-30
Support Year:
Fiscal Year: 2007
Total Cost: $251,828
Indirect Cost:
Name: Georgia Tech Research Corporation
Department:
Type:
DUNS #:
City: Atlanta
State: GA
Country: United States
Zip Code: 30332