With National Science Foundation support, Dr. William Lewis will lead a team conducting eighteen months of research on linguistic ontology development and text mining. The project, with the help of Dr. Scott Farrar, will expand an existing linguistic ontology through extensive text mining of linguistic data structures that exist on the Web, focusing specifically on interlinear text found mostly in scholarly articles. Central to the text mining task will be the development of a set of programming tools for accessing and deploying the ontology. The ontology will guide the text mining in interpreting the data found on the Web, and the results of the text mining will in turn be used to develop the ontology, so that the two proceed cyclically.
As more and more linguistic data has been placed on the Web, the ability to locate, access, and compare these data resources has become essential to the discipline. Unfortunately, because the terminology used by various researchers differs, access to resources can be limited or difficult, and comparison across resources is generally not possible. A linguistic ontology encodes the knowledge space of the field as a computational artifact, making possible the automated comparison of resources irrespective of the terminology in which they are encoded. Text mining of linguistic resources is an essential part of the proposed research because it allows a comprehensive picture of linguistic knowledge to develop and, when applied to ontology construction, can make the ontology more "data-aware." Further, the development of programming tools to access the ontology will have the added benefit of making the ontology more readily accessible to other researchers wishing to use it, laying the foundations for a Semantic Web for linguistics.
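As a rough illustration of how an ontology can make resources comparable despite differing terminology, the following minimal Python sketch maps the grammatical labels used in two interlinear gloss lines onto shared concepts and then compares the results. The label table, the concept names, the function name `concepts_in`, and the sample gloss lines are all hypothetical; they are not taken from the project's ontology or tools.

```python
# Hypothetical mapping from resource-specific gloss labels to shared concepts.
# The concept names are illustrative only, not drawn from the project's ontology.
LABEL_TO_CONCEPT = {
    "PST": "PastTense",
    "PAST": "PastTense",
    "PL": "PluralNumber",
    "PLUR": "PluralNumber",
    "3SG": "ThirdPersonSingular",
    "3S": "ThirdPersonSingular",
}


def concepts_in(gloss_line):
    """Return the set of shared concepts named by the labels in one gloss line."""
    concepts = set()
    for word in gloss_line.split():
        # Gloss elements within a word are separated by hyphens or periods.
        for label in word.replace(".", "-").split("-"):
            concept = LABEL_TO_CONCEPT.get(label.upper())
            if concept is not None:  # lexical glosses such as "dog" are skipped
                concepts.add(concept)
    return concepts


# Two invented gloss lines that use different labels for the same categories.
resource_a = "dog-PL run-PST"
resource_b = "hound-PLUR walk-PAST"

shared = concepts_in(resource_a) & concepts_in(resource_b)
print(sorted(shared))  # ['PastTense', 'PluralNumber']
```

Although the two gloss lines share no labels, they resolve to the same concepts once mapped, which is the kind of terminology-independent comparison described above. A real ontology would also encode relations among concepts rather than a flat lookup table.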