The goal of this research is to develop scalable methods for individual identification in large biological databases, with application to multiple species in ongoing conservation efforts around the world. This work couples crowd-sourcing with computational learning. In an identification system that learns from user inputs, multiple inputs can improve both the quality and the rate of identification, which in turn dramatically reduces the subsequent work cycles any human must perform. The hypothesis is that computational learning will improve the quality of the solutions produced by a coupled human-machine system, helping to solve large and diverse identification problems. Several interactive learning algorithms are proposed for all stages of indexing and search in biological image databases. The algorithms are based on non-parametric Bayesian inference and deliver incremental online learning methods. To test the interactive learning hypothesis, the MIT Sloop system will be extended to include the proposed interactive learning tools. Sloop is a robust toolkit of vision tools that has been tested on multiple species and supports an operational deployment. As part of this research, a distributed Sloop system indexing multiple species will be developed.

It is difficult to accurately estimate the effectiveness of conservation efforts for many rare and endangered species without the ability to quantify the spatial scales and other statistics of animal migration and movement. Tagging, an established method, is of limited effectiveness because it is often invasive and cannot be applied to large numbers of animals. This research examines whether large numbers of animals in multiple species can be identified individually using photographs stored in a database. The "animal biometrics" approach proposed here will examine how citizen scientists can crowd-source relevance judgments and other inputs, how those judgments can improve the computer's identification performance, and how improved performance reduces the workload of the citizen or expert scientist. The mechanics of the symbiotic human-computer interaction mediated by machine learning, and the methods by which patterns are indexed and searched, also have impacts in other fields. For example, some of the tools developed here have been applied in the geosciences and in weather prediction.

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Type: Standard Grant (Standard)
Application #: 1146747
Program Officer: Jennifer Weller
Project Start:
Project End:
Budget Start: 2012-06-15
Budget End: 2016-05-31
Support Year:
Fiscal Year: 2011
Total Cost: $432,477
Indirect Cost:

Name: Massachusetts Institute of Technology
Department:
Type:
DUNS #:
City: Cambridge
State: MA
Country: United States
Zip Code: 02139