This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).
The University of Arizona is awarded a grant to develop and evaluate a set of algorithms and software that help computers read and "understand" taxonomic descriptions of plants, animals, and other living or fossil organisms. The major functions of the algorithms and software are to: 1) annotate large sets of text descriptions in a machine-readable way to support various knowledge applications, including producing character matrices and identification keys for various taxon groups (the algorithms do not need large sets of training examples or hand-crafted regular expressions or grammar rules, which makes them easy to use with different document types and/or subjects); 2) generate interactive identification keys for various taxon groups with minimal human intervention; 3) discover and verify the use of ontological concepts and relationships as represented in the descriptions, to improve the coverage and literary warrant of domain ontologies; and 4) produce high-quality, reusable resources (e.g., lexicons and benchmarks) for future bioinformatics research. The broader impacts of this work include support for community-based informatics activities (e.g., ontology development, document annotation and retrieval). All software produced by the project will be made open source, along with documentation and training materials; reusable knowledge entities (glossaries, lexicons, etc.) generated by the project will be conveniently accessible via Web services; and annotated document collections will be accessible for research and education and to the public. In addition, Web-based education modules will be developed to spark the interest of K-12 students in taxonomy and introduce them to scientific ways of collecting, observing, describing, and identifying ant specimens. Lastly, training opportunities and a mentoring program for graduate-level students interested in digital information management will be created, using the software and document collections provided by the project.
Further information about this project may be found on the PI's website at http://sirls.arizona.edu/cui.