This project aims to acquire a cluster of dual-processor computers with large memory and storage for computational research into large-scale data problems. It supports research in bioinformatics (modeling molecular pathways, predicting genes, and protein folding), information retrieval (natural language understanding, OCR for digital libraries, image annotation, information extraction from the web, and learning representations), and systems (garbage collection and memory management algorithms, networks, and distributed agents). The work in computational biology is expected to provide a better understanding of the genomes and protein structures of several species, and research on new algorithms is expected to lead to better tools for allocating computation on large clusters and for debugging code on such clusters. Research using improved data-rich statistical models should advance the state of the art in machine learning, web search and information retrieval, information extraction, print and handwriting recognition, and machine translation, leading to better systems for web data and better translation systems. Research into large numbers of interacting agents may contribute to building better e-commerce systems. The overall project, which involves 2 other institutions (Smith and Mt. Holyoke), includes 16 investigators, 1 research scientist, 4 postdocs, 70 graduate students, and 19 undergraduate students.