Increasingly, real-world data has many dimensions or features but rather than filling out all dimensions equally, the data distributed on or near a surface of much lower intrinsic dimension. Examples include fMRI data and natural image and video data sets. This research will push the boundary of what is currently possible in the analysis of such large data sets by incorporating hierarchical structure into what is known as a "sparse dictionary representation" of the data. The results will include both basic intellectual contributions to machine learning methods and computational advancements that will aid the investigation of complex real world data.

A fundamental problem in learning the structure of complex data is how to effectively extract a set of features that reflects the underlying structure of the data. In many applications, including face recognition and object recognition, sparse dictionary representations have proved effective for this purpose. However, since solving large-scale sparse representation problems is very expensive, the method is generally limited to problems of moderate scale. This research reformulates the method into an incremental, multi-stage, hierarchical dictionary learning process. This approach incrementally extracts information from the data and uses this to refine the data representation in an organized hierarchical fashion. This enables the building of large-scale dictionaries in a computationally efficient way. The method hence extends the power of sparse dictionary representation methods to a wider variety of real world applications. It also has the flexibility to incorporate an existing state-of-the-art sparse coding algorithm as the basic solver and hence can extend the functionality of existing sparse coding algorithms to multi-stage, hierarchical dictionary learning.

Project Start
Project End
Budget Start
Budget End
Support Year
Fiscal Year
Total Cost
Indirect Cost
Princeton University
United States
Zip Code