A major bottleneck in present-day machine translation (MT) is finding an appropriate approximation when no exact translation exists between the source and target languages. An MT system must resolve such mismatches either by incorporating implicit information from context or by leaving out some of the information in the source text. This project implements a prototype MT system that follows the usual MT steps: analysis (analyzing Japanese sources into a Japanese-oriented semantic representation), transfer (mapping the Japanese-oriented semantics into an English-oriented semantics), and generation (building English targets from the transferred semantics). It adds a novel Mismatch Resolution Module (MRM), which is called when generation fails. In this architecture, the analysis and generation modules are purely monolingual, and transfer is deliberately simple, incorporating minimal context information. The bulk of mismatch resolution and disambiguation is done in the generation-MRM loop, where the MRM offers solutions to the problems encountered by generation. Two kinds of MRM are explored, logical and statistical, with the design goal of a single MRM that combines the advantages of both approaches. Translations are evaluated by monolingual English speakers applying evaluation measures modeled on those of the DARPA MT program. The focus is on translating the Japanese joint venture business articles in the MUC-5 corpus into English. The project attacks the crucial mismatch resolution problem with a novel architecture, logical and statistical techniques, and on-line text resources.
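
The control flow described above, in which generation is retried after the MRM proposes a fix, can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the names `translate`, `GenerationFailure`, and the `toy_*` functions are hypothetical, the dictionary stands in for a real semantic representation, and the missing-determiner case is just one simple kind of mismatch. It is not the project's implementation.

```python
from typing import Callable, Optional


class GenerationFailure(Exception):
    """Hypothetical signal that generation could not realize the semantics."""

    def __init__(self, problem: str):
        super().__init__(problem)
        self.problem = problem


def translate(
    source: str,
    analyze: Callable[[str], dict],
    transfer: Callable[[dict], dict],
    generate: Callable[[dict], str],
    resolve: Callable[[dict, str], Optional[dict]],
    max_rounds: int = 5,
) -> Optional[str]:
    """Run analysis and transfer once, then loop between generation and the MRM."""
    semantics = transfer(analyze(source))
    for _ in range(max_rounds):
        try:
            return generate(semantics)
        except GenerationFailure as failure:
            # The MRM (here `resolve`) proposes a revised representation.
            revised = resolve(semantics, failure.problem)
            if revised is None:
                return None  # the MRM has no solution for this problem
            semantics = revised
    return None  # give up after max_rounds attempts


# Toy stand-ins (all hypothetical): the source leaves definiteness
# unspecified, but the target generator requires a determiner.
def toy_analyze(source: str) -> dict:
    return {"predicate": "establish", "object": source}


def toy_transfer(semantics: dict) -> dict:
    return dict(semantics)  # minimal transfer: copy unchanged


def toy_generate(semantics: dict) -> str:
    if "determiner" not in semantics:
        raise GenerationFailure("determiner")
    return f"{semantics['predicate']} {semantics['determiner']} {semantics['object']}"


def toy_resolve(semantics: dict, problem: str) -> Optional[dict]:
    if problem == "determiner":
        return {**semantics, "determiner": "a"}  # fill in a default value
    return None


result = translate("joint venture", toy_analyze, toy_transfer, toy_generate, toy_resolve)
print(result)
```

In this sketch the resolver is a fixed rule; a logical or statistical MRM of the kind the abstract describes would take its place behind the same `resolve` interface.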

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9628880
Program Officer: Ephraim P. Glinert
Project Start:
Project End:
Budget Start: 1996-10-15
Budget End: 2001-07-31
Support Year:
Fiscal Year: 1996
Total Cost: $394,985
Indirect Cost:
Name: SRI International
Department:
Type:
DUNS #:
City: Menlo Park
State: CA
Country: United States
Zip Code: 94025