IRI-9525790
Chen, Hsinchun
University of Arizona
$200,755 - 36 mos
Concept-based categorization and search on Internet: a machine learning, parallel computing approach

The research is grounded in automatic textual analysis of Internet documents (homepages) and attempts to address the Internet search problem by first categorizing the content of Internet documents and subsequently providing semantic search capabilities based on a concept space and a genetic algorithm spider (agent). As a first step, a multi-layered neural network clustering algorithm employing the Kohonen self-organizing feature map categorizes the Internet homepages according to their content. The category hierarchies created could serve to partition the vast Internet services into subject-specific categories and databases. After individual subject categories have been created, domain-specific concept spaces (graphs of terms and their weighted relationships) are generated for each subject category. The concept spaces can then be used to support concept-based information retrieval, a significant improvement over the existing keyword searching and hypertext browsing options for Internet resource discovery. Lastly, using homepages in each subject category, a client-based genetic algorithm "spider" (agent) will be developed that can perform a global, stochastic search on the Internet based on individual searchers' preferences. While focused on the Internet, the concepts and results of this research are expected to be generally applicable in other networked environments.
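
As an illustration of what a concept space (a graph of terms and their weighted relationships) might look like, the following is a minimal sketch only, not the project's method. The abstract does not specify how relationship weights are computed, so the asymmetric co-occurrence weighting used here is an assumption, and the names `build_concept_space` and `related_terms` and the sample documents are hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def build_concept_space(docs):
    """Return {term: {other_term: weight}} from a list of tokenized documents.

    Assumed weighting (not taken from the abstract):
    weight(a -> b) = (documents containing both a and b) / (documents containing a)
    """
    doc_freq = defaultdict(int)   # number of documents containing each term
    pair_freq = defaultdict(int)  # number of documents containing each ordered pair
    for tokens in docs:
        terms = set(tokens)       # count each term once per document
        for term in terms:
            doc_freq[term] += 1
        for a, b in permutations(terms, 2):
            pair_freq[(a, b)] += 1

    space = defaultdict(dict)
    for (a, b), n in pair_freq.items():
        space[a][b] = n / doc_freq[a]
    return dict(space)


def related_terms(space, term, k=3):
    """Return the k most strongly related terms, ties broken alphabetically."""
    neighbours = space.get(term, {})
    return sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical homepages from one subject category, already tokenized.
docs = [
    ["neural", "network", "clustering"],
    ["neural", "network", "retrieval"],
    ["genetic", "algorithm", "retrieval"],
]
space = build_concept_space(docs)
print(related_terms(space, "neural"))
# [('network', 1.0), ('clustering', 0.5), ('retrieval', 0.5)]
```

In a concept-based search, a query term such as "neural" could be expanded with its highest-weighted neighbours before retrieval, which is one way such a graph can go beyond exact keyword matching.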