Humans possess the ability to learn increasingly sophisticated representations of the world in which they live. In the visual domain, it is estimated that we can identify on the order of 30,000 object categories at multiple levels of granularity (e.g. toe-nail, toe, leg, human body, population). Moreover, humans continuously adapt their models of the world in response to data. Can we replicate this lifelong-learning capacity in machines?
In this project, the PIs build hierarchical representations of data streams. The model complexity adapts to new structure in data by following a nonparametric Bayesian modeling paradigm. In particular, the depth and width of the hierarchical models grow over time. Deeper layers in this hierarchy represent more abstract concepts, such as "a beach scene" or "chair", while lower levels correspond to parts, such as a "patch of sand" or "body part". The formation of this hierarchy is guided by fast hierarchical bottom-up segmentation of the images.
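The defining property of the nonparametric Bayesian paradigm mentioned above is that model complexity is not fixed in advance but grows with the data. As a minimal illustration (not the PIs' actual model), the Chinese restaurant process shows this behavior: the number of clusters it creates grows, roughly logarithmically, as more observations arrive.

```python
import random

def crp_assignments(n_points, alpha=1.0, seed=0):
    """Assign points to clusters via the Chinese restaurant process.

    The number of clusters is not fixed up front: a new point joins an
    existing cluster with probability proportional to its size, or opens
    a new cluster with probability proportional to alpha. The cluster
    count therefore adapts to the amount of data, the key idea behind
    nonparametric Bayesian complexity growth.
    """
    rng = random.Random(seed)
    counts = []   # counts[k] = number of points in cluster k
    labels = []   # cluster label for each point
    for i in range(n_points):
        r = rng.uniform(0.0, i + alpha)
        acc = 0.0
        for k, c in enumerate(counts):
            acc += c
            if r < acc:
                counts[k] += 1
                labels.append(k)
                break
        else:
            counts.append(1)            # open a new cluster
            labels.append(len(counts) - 1)
    return labels, counts
```

Running this with increasing `n_points` yields a slowly growing number of clusters; a hierarchical variant of this idea would similarly let the depth and width of a model expand as new structure appears in the stream.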
To process large amounts of information, the PIs distribute computation across many CPUs/GPUs. They develop novel fast inference techniques based on variational inference, memory-bounded online inference, parallel sampling, and efficient data structures.
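Memory-bounded online inference of the kind listed above typically processes each mini-batch once and then discards it, keeping only fixed-size summary statistics. A minimal sketch in that spirit (the step-size schedule is the standard Robbins-Monro choice from stochastic variational inference, not the PIs' specific algorithm):

```python
def online_mean_estimate(stream, tau=1.0, kappa=0.7):
    """Memory-bounded streaming estimate of a mean.

    Each mini-batch is folded into the running estimate with a decaying
    step size rho_t = (t + tau)^(-kappa) and then discarded, so memory
    stays constant regardless of how much data flows past. This mirrors
    the update structure of stochastic variational inference, where the
    same schedule drives natural-gradient steps on variational
    parameters.
    """
    est = 0.0
    for t, batch in enumerate(stream, start=1):
        rho = (t + tau) ** (-kappa)          # decaying step size
        batch_mean = sum(batch) / len(batch)
        est = (1.0 - rho) * est + rho * batch_mean
    return est
```

With `0.5 < kappa <= 1`, the step sizes satisfy the usual convergence conditions, so the estimate stabilizes even though no past data are retained.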
The technology under development has a large number of potential applications, ranging from organizing digital libraries and the worldwide web to building visual object recognition systems, deploying autonomous robots, and training a "virtual doctor" by processing worldwide information from hospitals about diseases, diagnoses, and treatments.
Results are disseminated through scientific publications and publicly available software.