There is a tremendous amount of web content available today, but it is not always in a form that supports end-users' needs. For example, it is easy to find a list of hotels in Portland (once the user specifies which Portland is meant, e.g., Oregon, Maine, Texas, or Connecticut), but not so easy to sort them by distance to the Portland convention center. All of the data and services needed to accomplish this goal already exist, but they are not in a form amenable to this task.

A rapidly growing community of developers is addressing this problem by creating "mashups" that combine existing web content and services in new ways. However, creating a mashup requires a high level of programming expertise. This proposal outlines the development of Marmite, a tool that will let everyday end-users create mashups by making it easy to extract content from web pages, process it in a dataflow manner, integrate it with other data sources, and direct it to a variety of useful sinks, such as databases, map services, and compilable source code that can be further customized.

This exploratory project focuses on three high-risk issues: (1) making it easy to select what content to crawl, (2) developing a hybrid dataflow / spreadsheet user interface (UI) that shows what content has been extracted and how that content is transformed, and (3) developing techniques for handling exceptions in the dataflow.
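
As a rough illustration of the dataflow idea and of issue (3), the hotel example can be sketched in a few lines of Python. This is not Marmite's design or implementation; the function names, the hotel records, and the convention-center coordinates are hypothetical placeholders chosen only for the sketch. Each step is applied row by row, and rows that raise an exception are set aside with the reason instead of halting the whole flow.

```python
import math

# Hypothetical convention-center coordinates (placeholder values for the sketch).
CENTER = (45.528, -122.663)

def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) pairs."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def add_distance(row):
    """Dataflow step: annotate a hotel row with its distance to CENTER."""
    return {**row, "km": haversine_km((row["lat"], row["lon"]), CENTER)}

def run_step(step, rows):
    """Apply one step to every row, collecting failures instead of stopping."""
    ok, failed = [], []
    for row in rows:
        try:
            ok.append(step(row))
        except (KeyError, TypeError, ValueError) as exc:
            failed.append((row, repr(exc)))
    return ok, failed

# Made-up extracted rows; the third is missing its coordinates.
hotels = [
    {"name": "Hotel A", "lat": 45.520, "lon": -122.680},
    {"name": "Hotel B", "lat": 45.530, "lon": -122.660},
    {"name": "Hotel C"},
]

ok, failed = run_step(add_distance, hotels)
ranked = sorted(ok, key=lambda r: r["km"])  # sink: hotels ordered by distance
```

Keeping the failed rows alongside the successful ones is one possible way to surface exceptions to the user, for example as flagged rows in a spreadsheet-style view.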

Success in this research will result in a tool that will let average web users create mashups, potentially stimulating the creation of many new kinds of services. Marmite is expected to be applicable across many web-based scenarios and to be of interest to startups and existing developers of mashups. The project plans to enlist the participation of 25 undergraduate and graduate students annually through courses taught by the PI, and it has the potential for technology transfer through corporate partners at Carnegie Mellon University's Human-Computer Interaction Institute and CyLab. The project Web site (www.cs.cmu.edu/~jasonh/projects/marmite/) will be used for disseminating additional information and results.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0646526
Program Officer: Maria Zemankova
Project Start:
Project End:
Budget Start: 2006-09-15
Budget End: 2008-02-29
Support Year:
Fiscal Year: 2006
Total Cost: $75,000
Indirect Cost:

Name: Carnegie-Mellon University
Department:
Type:
DUNS #:
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213