The Center for Intelligent Information Retrieval (CIIR) is investigating the impact of statistically derived semantic word relationships on information retrieval. Exploiting these relationships, for example by identifying when different words express the same content, can lead to more effective rankings of retrieval results. Semantic relationships are not labeled explicitly in text and are too varied to be identified solely by hand. The CIIR is mining massive corpora for direct and indirect word co-occurrence data using both offline and retrieval-time computation. The particular focus is on techniques that create and use Web-based corpora of "comparable" sentences and text chunks for estimating word and phrase translation probabilities, and on techniques that derive relationships from "context vectors" that represent word and phrase meanings. The quality of the discovered word relationships is being tested using large-scale retrieval experiments. In addition, the CIIR is addressing computational barriers to large-scale data mining by moving its new distributed computational framework, TupleFlow, to Hadoop. TupleFlow was developed for the type of indexing and analysis operations that are required for large-scale studies of relational structure in text. It is an extension of MapReduce that offers advantages in flexibility, scalability, and disk abstraction, along with low abstraction penalties. This work is expected to have broad impact by improving the quality of search results.
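
As a rough illustration of the "context vector" idea mentioned above, the following minimal sketch builds co-occurrence vectors from a toy corpus and compares words by cosine similarity. It is not the CIIR's implementation: the corpus, the window size, and the function names `context_vectors` and `cosine` are all assumptions made for the example, and a real system would work over massive corpora with weighting and smoothing.

```python
from collections import Counter, defaultdict
import math


def context_vectors(sentences, window=2):
    """Map each word to a Counter of words seen within `window` positions of it.

    Hypothetical helper for illustration; the window size is an assumption.
    """
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (assumed for the example): "car" and "automobile" share contexts.
corpus = [
    "the car drove down the road",
    "the automobile drove down the road",
    "the cat slept on the mat",
]

vecs = context_vectors(corpus)
sim_synonyms = cosine(vecs["car"], vecs["automobile"])
sim_unrelated = cosine(vecs["car"], vecs["cat"])
print(f"car vs automobile: {sim_synonyms:.3f}")
print(f"car vs cat:        {sim_unrelated:.3f}")
```

In this toy setting, words that occur in the same contexts receive a higher similarity score than words that share only function words, which is the kind of signal that could be used to relate different words expressing the same content.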