Access to up-to-date, high-quality information can have a significant impact on the humanitarian relief community as it coordinates relief efforts. In addition to data that is created and curated by experts, there is a vast volunteer community empowered by the social Web to generate community-curated content on sites such as Flickr, Del.icio.us, and Google Earth. Combining data from experts and volunteers can facilitate the efforts of relief agencies. To use this data effectively, one needs to (1) discover relevant sources, (2) assess their quality, and (3) understand their content. Fortunately, the wealth of content and metadata on the social Web, i.e., annotations in the form of tags, can aid in this task of semantic discovery and quality assessment.

The GeoNets project will develop methods to analyze social content and metadata in order to extract concepts, including geospatial concepts, and generate semantically rich geospatial data. GeoNets can also increase the re-use of data by suggesting terms to improve the quality of existing annotations. GeoNets will develop methodologies for semantic discovery and quality assessment to create a GeoNets dataspace and provide a user-friendly query language.

The methods developed by the project also apply to other fields where information created by a lay community augments the knowledge produced by professionals. In several scientific disciplines, including astronomy, biology and ecology, an army of passionate amateurs is making new observations and discoveries. GeoNets will create tools that will enable scientists to leverage community-generated knowledge to create up-to-date, semantically rich dataspaces.

Project Report

The social Web sparked a revolution by putting knowledge production tools in the hands of ordinary people. Today, on social Web sites such as Twitter, Flickr, and YouTube, large numbers of users not only create rich content, including photos and videos, but also annotate it with descriptive labels known as tags, and geo-reference it by associating it with geographic coordinates, known as geo-tags. The implicit relationships between tags, their times and their locations can be computationally mined to add a rich semantic layer to user-generated, community-curated content. As the quantity of user-generated content on the social Web continues to grow, extracting such knowledge will enable us to use that content more effectively across a wide spectrum of applications, from monitoring the environment to managing resources and interacting with the world and one another. Crisis and disaster management are the special focus of the GeoNets project, since they exemplify situations in which decision-making requires rapid access to relevant, high-quality information. While such information may be available, rapidly identifying it, understanding it and making it usable remains a challenge. GeoNets has addressed this challenge by developing computational methods and tools for discovering relevant data sources, assessing their quality, and understanding their semantics. The project had three main components, described below.

Harvesting Geospatial Knowledge from Social Annotations: User-generated content is increasingly geo-referenced by mobile phones and cameras, which associate geographic coordinates with the images and videos they take. As users post the content and discuss it on online social networks, the implicit relationships between the words they use and locations can tell us much about how people conceptualize places and the relations between them.
However, extracting such knowledge from online data presents many challenges, since such data is often ambiguous, noisy, uncertain and spatially inhomogeneous. GeoNets researchers from the University of Southern California have developed a general and flexible probabilistic framework for modeling geo-referenced content to learn about places, their boundaries and the relations between them. The researchers showed that their framework outperforms state-of-the-art approaches developed to extract place semantics and to predict the locations of photos from their tags. They also used the same framework to learn accurate place boundaries, learn relations between places and assemble them within a directory of places.

Semantic Integration of Geospatial Data using Karma: The GeoNets project developed an integrated approach to extracting and fusing geospatial sources within a unified mashup-building tool, Karma. The focus of Karma has been on usability, to enable users who are comfortable using spreadsheets to perform complex geospatial data fusion tasks, such as importing data, normalizing it, integrating it with other data, invoking data fusion algorithms, visualizing the results and publishing new datasets with the fused data. Semantics plays a central role in Karma’s approach to information integration. When users import data, Karma semi-automatically builds models of the data according to a user-selected ontology. Semantic descriptions also allow Karma to propose meaningful "joins" across the data and to publish fused data in a variety of formats.

Event Detection in Social Media: Scalability and accuracy are major challenges in processing large, noisy data collected from social media. GeoNets researchers developed KeyGraph Event Detection (KED), a novel and efficient method that improves on current topic detection and tracking methods by considering keyword co-occurrence.
The researchers showed that KED achieves accuracy comparable to that of gold-standard approaches for topic detection on small, well-annotated collections. Further, KED can successfully filter noise and identify events in social media collections. An extensive evaluation using Amazon Mechanical Turk demonstrated both the accuracy and recall of KED.

The project engaged in several outreach activities, the most notable of which was the Codeathon for Humanity at the Tenth Annual Grace Hopper Celebration of Women in Computing in September 2010. Co-PI Louiqa Raschid represented the NSF GeoNets project and the Sahana Software Foundation and co-organized the Codeathon. This program introduced 150 participants to the use of free open-source software for humanitarian applications and highlighted the use of standards and data exchange facilitated by the Sahana disaster data management platform.
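
To make the keyword co-occurrence idea behind the event detection component more concrete, the following Python sketch builds a small co-occurrence graph from a handful of posts and groups keywords into candidate events. It is only an illustration under stated assumptions, not the KED method itself: the sample posts, the function names, the two thresholds, and the use of plain connected components to form clusters are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(documents, min_doc_freq=2, min_cooccurrence=2):
    """Return {(kw_a, kw_b): count} for keyword pairs that co-occur often enough.

    Both thresholds are illustrative assumptions, not values from the project.
    """
    # Treat each whitespace-separated token as a keyword (a simplification).
    doc_keywords = [set(doc.lower().split()) for doc in documents]
    # Document frequency: in how many documents each keyword appears.
    doc_freq = Counter(kw for kws in doc_keywords for kw in kws)
    frequent = {kw for kw, n in doc_freq.items() if n >= min_doc_freq}
    # Count how many documents contain both keywords of each pair.
    edges = Counter()
    for kws in doc_keywords:
        for a, b in combinations(sorted(kws & frequent), 2):
            edges[(a, b)] += 1
    # Dropping rare keywords and weak edges acts as a crude noise filter.
    return {pair: n for pair, n in edges.items() if n >= min_cooccurrence}


def connected_components(edges):
    """Group keywords into clusters (connected components of the graph)."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, clusters = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Hypothetical posts, invented for this example.
posts = [
    "earthquake haiti rescue",
    "haiti earthquake aid",
    "rescue teams haiti earthquake",
    "concert tickets tonight",
    "tonight concert downtown",
    "lunch",
]
graph = build_cooccurrence_graph(posts)
clusters = connected_components(graph)
for cluster in clusters:
    print(sorted(cluster))
```

With these sample posts, keywords that appear only once ("aid", "lunch") are discarded before any edges are built, and the remaining keywords fall into two groups, one per candidate event. A real system would need proper tokenization, stop-word handling, and a more careful clustering step than connected components.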

Agency: National Science Foundation (NSF)
Institute: Division of Civil, Mechanical, and Manufacturing Innovation (CMMI)
Application #: 0753124
Program Officer: Dennis Wenger
Project Start:
Project End:
Budget Start: 2008-09-01
Budget End: 2012-08-31
Support Year:
Fiscal Year: 2007
Total Cost: $623,880
Indirect Cost:
Name: University of Southern California
Department:
Type:
DUNS #:
City: Los Angeles
State: CA
Country: United States
Zip Code: 90089