Many people today, including news analysts, opinion pollsters, advertisers, and government regulation writers, need to interpret, structure, and rapidly master large quantities of opinion-based text. New research is needed to develop text-processing tools that can perform advanced analysis of large text collections. This research will build on current text-processing technologies, such as text clustering, text search using information retrieval, and extractive summarization, to build and test tools tailored to the specific needs of government personnel working in an electronic rulemaking environment.

This project focuses on the federal government's several thousand regulation writers, employed in some 200 agencies, who formulate, in a tightly scripted procedure, the rules and regulations that define the details of our laws. The project will attempt to solve several novel problems central to language processing research. In turn, it will deploy and evaluate a Rule-Writer's Workbench: a set of language tools that enables regulation writers, singly or jointly, to obtain a detailed and multidimensional overview of the material. This research has the potential for impact far beyond IT and social science academia. It will explore such novel issues as author typing, opinion/affect determination, and near-duplicate detection. If even a handful of the new technologies are effective, they may eventually help thousands of regulation writers communicate more effectively with millions of citizens and understand their comments in our increasingly digitized society, and produce better regulatory rules for all of us.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0429102
Program Officer: Sylvia J. Spengler
Project Start:
Project End:
Budget Start: 2004-09-01
Budget End: 2009-08-31
Support Year:
Fiscal Year: 2004
Total Cost: $550,936
Indirect Cost:
Name: Carnegie-Mellon University
Department:
Type:
DUNS #:
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213