Focusing on the domain of biology, this project will develop a suite of tools that enable scientists to integrate data from multiple web sources, to visualize and manipulate the integrated data, and to republish it to the larger scientific community on the web. The project aims to make complex, web-based data integration an activity that individual scientists can perform on an ad hoc basis, collecting, manipulating, and publishing precisely the information they want to work with. We address three aspects of the data integration problem. The project's tasks include: (1) developing tools that let non-programmers collect the information they wish to integrate, extracting it from numerous non-cooperating web sites or from data repositories with disparate schemata and structuring it into an integrated data model; and (2) developing ontologies and tools that let users build their own task-specific information management applications. The work will be evaluated by assembling a broad reference corpus of web data sources, primarily in biology, and measuring the ability of the components to collect, integrate, and redisseminate data from these sources. In addition, formative and summative user studies of the tools will be performed in the lab, and the tools will be deployed for biologists' use in their own work.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0712793
Program Officer: Ephraim P. Glinert
Project Start:
Project End:
Budget Start: 2007-09-15
Budget End: 2010-08-31
Support Year:
Fiscal Year: 2007
Total Cost: $380,000
Indirect Cost:
Name: Massachusetts Institute of Technology
Department:
Type:
DUNS #:
City: Cambridge
State: MA
Country: United States
Zip Code: 02139