Recent years have witnessed significant improvements in automatic machine translation systems. An enabling factor behind this rapid progress is the adoption of automatic evaluation metrics, which guide developers in improving translation systems. However, current metrics are only a coarse approximation of human judgments of system output quality. This SGER project is exploring ways of improving current metrics on three fronts. First, judgments of syntactic similarity between system outputs and human references are being incorporated into the metric alongside phrasal similarity. Second, machine learning techniques are being applied to find characteristics that correlate with human judgments of quality. Third, the project is developing more fine-grained analyses of what is wrong, as well as what is right, with system outputs, based on both their syntactic and lexical characteristics. In developing a learnable, syntax-aware metric, the project will generalize discriminative learning methods to complex, structured problems. A more informative automatic evaluation metric will lead to more rapid improvements in machine translation technology, enabling access to archives from centuries past and to documents from around the world. All relevant software and non-proprietary data will be distributed widely. Written reports of experimental results and findings will be disseminated through academic publications.