The goal of this research project is to make a qualitative leap in the speed and accuracy of machine translation (MT) between human languages. The overall strategy is to marry novel schemes for representing structured knowledge about translational equivalence with recent and upcoming advances in machine learning. The use of machine learning offers the practical advantages of broad coverage, graceful degradation, and rapid adaptation to new domains of discourse. This work builds on recent strides in machine learning and extends them to make structured predictions from structured data. At the same time, the research is grounded in sound knowledge representation schemes that can accommodate and exploit whatever information can be efficiently acquired on a large scale. When properly optimized, rich and extensible knowledge representation schemes enable MT systems to make important linguistic generalizations, leading to more accurate predictions in a wider variety of contexts.
Advances in this field have the potential to promote communication and understanding between different cultures, grease the wheels of international commerce, further the national security interests of the U.S. and its allies, and save lives by facilitating communication during natural or man-made catastrophes. The research results will be widely published, and the resulting software will be made publicly available. In addition, the research will be accompanied by a comprehensive educational program that includes new curricular materials, conference workshops, and ample opportunities for students at all levels to learn by doing.