This project addresses the task of automatically detecting the influence structure and flow of ideas in document corpora that have grown over time (e.g., scientific literature, political debates, news, email, wikis, blogs), in order to trace the origin and development of ideas. The intellectual merit of this project lies in the development of statistically well-founded methods for discovering and analyzing the influence structure in evolving archives. In particular, the project will focus on developing statistical tests for two relations that are central to understanding the structure of an archive and how its ideas developed: originality and influence. The ability to detect the influence and origin of ideas will be of substantial help in understanding, exploring, interpreting, visualizing, and aggregating the rapidly growing body of historical text available online. The project will also evaluate how these methods augment traditional citation analysis in hyperlinked collections, and whether they allow similar functionality even in non-hyperlinked archives. The new capability will benefit a number of widely used applications, including search engines for internet content.