Search engines and other information retrieval technologies are critical in the digital age. The goal of the proposed research is to investigate novel frameworks for analyzing and efficiently evaluating measures of retrieval performance, with the aim of enabling research that leads to better search engines and other retrieval technologies.
Two novel frameworks are proposed: (1) an information-theoretic framework within which one can quantitatively assess what various measures of retrieval performance are actually measuring, and (2) a statistical framework within which one can efficiently estimate these measures of retrieval performance. The former provides a theoretical underpinning for retrieval evaluation and analysis; the latter provides a practical methodology for efficiently evaluating search engines on a large scale. Both are intended to support research leading to better search engines and search technology.
The project has several potential impacts. From a research and infrastructure perspective, it will yield published research results and freely available software artifacts (via www.ccs.neu.edu/home/jaa/IIS-0534482/) that will permit large-scale retrieval evaluation with minimal effort. Academics and other technologists will be able to efficiently test and evaluate new retrieval algorithms on novel data sets without incurring the high costs associated with the standard information retrieval evaluation paradigm. As such, one aspect of the project is an enabling technology that will foster more rapid development of new search algorithms. From an educational perspective, the project will train graduate and undergraduate students.