This project will attempt to build a full-text index of the textual web pages in the historical collections of the Internet Archive. The Internet Archive has captured and stored a snapshot of the web every two months since 1996. The collection now comprises approximately 40 billion web pages and occupies multiple petabytes of storage. The resulting index may be the largest and best-organized inverted index ever created that is freely available to academic researchers. It will enable social and information scientists to explore altogether new dimensions of contemporary events and practices, and it will give information scientists a vital large-scale testing resource in areas such as advanced information retrieval on semistructured collections.