This application addresses broad Challenge Area (06) Enabling Technologies and specific Challenge Topic 06-LM-101*, Intelligent Search Tool for Answering Clinical Questions. Technology is developed to answer questions posed in English when the answer must be inferred from information fragments distributed among multiple, disparate knowledge resources, including databases, software, and formal ontologies. The approach is illustrated by the development of a system to answer questions on HIV drug treatment, with an emphasis on drug resistance. Natural language technology is used to parse English-language questions and map them into a logical form. This form is passed to a reasoning system that is provided with an axiomatic subject domain theory: a collection of declarative sentences expressed in a symbolic, logical language over a pre-specified set of symbols. The reasoning system regards the logical query as a theorem and attempts to prove it; informed by the axioms of the theory, it transforms the query and decomposes it into subqueries, each of which may be answered by invoking one or more external knowledge resources. A knowledge resource is invoked if it is linked to a symbol in the subject domain theory that appears in the proof search. Answer fragments provided by the resources are then composed into answers to the original English question. The axioms of the domain theory serve to define the meaning of the symbols of the logical language, to describe the capabilities of the knowledge resources, and to provide the background knowledge necessary to link them together. When the English-language query is ambiguous, information provided by the theory serves to resolve the ambiguity by discarding parses that are not meaningful in the theory. When multiple parses remain, the clinician may be asked to choose among a set of unambiguous English paraphrases. The reasoning system can also pose questions of its own so that the clinician can resolve ambiguities.
The knowledge resources consulted during this process are heterogeneous; they have been developed by different researchers at diverse institutions, and they cannot be assumed to have adopted uniform representational conventions. Rather, we rely on external resources whose purpose is to translate the information produced by one resource into the form required by another. The logical proof from which the answer is composed serves as the basis for generating an English-language explanation and justification for the answer; it also serves to identify the sources of that answer. The external knowledge resources include clinical and research records of HIV treatment, including drug resistance; knowledge about HIV viral mutations; and online formal ontologies for HIV research. Some of these have been developed by the research personnel of this project.
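
The decomposition described above can be illustrated with a minimal Python sketch. It is not the project's reasoning system: a real theorem prover is replaced here by a single hard-coded axiom, and every name in it (the resources patient_records and resistance_kb, the predicates has_mutation, confers_resistance, and resistant_via, and the toy data) is a hypothetical placeholder rather than a real resource or clinical fact. The sketch shows only how a goal can be split into subqueries, how each subquery is routed to the resource linked to its symbol, and how the fragments are combined into an answer that records its sources.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Answer:
    value: str                 # one answer fragment (here, a mutation name)
    sources: Tuple[str, ...]   # resources consulted to derive it (provenance)


# Hypothetical stand-ins for external knowledge resources (toy data only).
def patient_records(patient: str) -> Set[str]:
    """Mutations observed in a patient's viral sequence."""
    return {"p1": {"mut_A", "mut_B"}}.get(patient, set())


def resistance_kb(drug: str) -> Set[str]:
    """Mutations associated with resistance to a drug."""
    return {"drug_X": {"mut_B", "mut_C"}}.get(drug, set())


# Each predicate symbol of the toy domain theory is linked to one resource.
RESOURCES: Dict[str, Tuple[str, Callable[[str], Set[str]]]] = {
    "has_mutation": ("patient_records", patient_records),
    "confers_resistance": ("resistance_kb", resistance_kb),
}

# One toy axiom:
#   resistant_via(P, D, M) <- has_mutation(P, M) and confers_resistance(M, D)
AXIOMS: Dict[str, List[str]] = {
    "resistant_via": ["has_mutation", "confers_resistance"],
}


def answer_query(goal: str, patient: str, drug: str) -> List[Answer]:
    """Decompose the goal into subqueries, call the linked resources,
    and intersect the returned fragments into a final answer."""
    bound_argument = {"has_mutation": patient, "confers_resistance": drug}
    combined: Optional[Set[str]] = None
    consulted: List[str] = []
    for subquery in AXIOMS[goal]:
        resource_name, call = RESOURCES[subquery]
        fragment = call(bound_argument[subquery])
        consulted.append(resource_name)
        combined = fragment if combined is None else combined & fragment
    return [Answer(m, tuple(consulted)) for m in sorted(combined or set())]


if __name__ == "__main__":
    for answer in answer_query("resistant_via", "p1", "drug_X"):
        print(answer.value, "from", ", ".join(answer.sources))
```

In this sketch the list of consulted resources plays the role that the proof plays in the proposed system: it is the record from which an explanation and an indication of sources could be generated.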

Public Health Relevance

The proposed project will create software that enables clinicians and research scientists to pose queries in English and receive answers that are computed or inferred from multiple heterogeneous online knowledge resources, along with an explanation, a justification, and an indication of the provenance of each answer. Users need no prior knowledge of the sources from which the information is obtained. This will enable clinicians to detect new relationships and will facilitate new discoveries in the area of HIV drug resistance and, ultimately, in other areas of medicine and health.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
NIH Challenge Grants and Partnerships Program (RC1)
Study Section
Special Emphasis Panel (ZRG1-BBBP-J (58))
Program Officer
Sim, Hua-Chuan
SRI International
Menlo Park
United States