0204109 Haesun Park University of Minnesota-Twin Cities
With the exponential growth of the internet and of computing power, information retrieval systems are expected to handle tremendous amounts of data, and users demand more efficient techniques for obtaining useful information from this flood of data. The goal of the proposed research is to find lower dimensional representations of text data for vector space based information retrieval. Dimension reduction is imperative for manipulating the massive quantities of data in today's information retrieval systems efficiently and effectively. A problem of fundamental importance here is to achieve a better representation of the data under relatively severe dimension reduction, rather than to reduce the dimension simply through a lower rank approximation of a matrix. One difficulty is that no theoretical formula readily measures how well a given dimension reduction method represents the original data, so it will be essential to conduct theoretical research in parallel with experimental study.