This project addresses the task of automatically detecting the influence structure and flow of ideas in document corpora that have grown over time (e.g., scientific literature, political debates, news, email, wikis, blogs), in order to trace the origin and development of those ideas. The intellectual merit of the project lies in developing statistically well-founded methods for discovering and analyzing the influence structure of evolving archives. In particular, the project will focus on statistical tests for two relations that are central to understanding the structure of an archive and how its ideas developed: originality and influence. The ability to detect the influence and origin of ideas will substantially help in understanding, exploring, interpreting, visualizing, and aggregating the rapidly growing body of historical text available online. The project will also evaluate how these methods augment traditional citation analysis in hyperlinked collections, and whether they allow similar functionality in non-hyperlinked archives. The new capability will benefit a number of widely used applications, including search engines for internet content.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0812091
Program Officer: Maria Zemankova
Project Start:
Project End:
Budget Start: 2008-09-01
Budget End: 2012-08-31
Support Year:
Fiscal Year: 2008
Total Cost: $449,578
Indirect Cost:

Name: Cornell University
Department:
Type:
DUNS #:
City: Ithaca
State: NY
Country: United States
Zip Code: 14850