Automated error analysis for multiword expressions: Using BLEU-type scores for automatic discovery of potential translation errors

Bogdan Babych, Anthony Hartley


We describe the results of a research project aimed at automatic detection of MT errors using state-of-the-art MT evaluation metrics, such as BLEU. Currently, these automated metrics give only a general indication of translation quality at the corpus level and cannot be used directly for identifying gaps in the coverage of MT systems. Our methodology uses automatic detection of frequent multiword expressions (MWEs) in sentence-aligned parallel corpora and computes an automated evaluation score for concordances generated for such MWEs which indicates whether a particular expression is systematically mistranslated in the corpus. The method can be applied both to source and target MWEs to indicate, respectively, whether MT can successfully deal with source expressions, or whether certain frequent target expressions can be successfully generated. The results can be useful for systematically checking the coverage of MT systems in order to speed up the development cycle of rule-based MT. This approach can also enhance current techniques for finding translation equivalents by distributional similarity and for automatically identifying features of MT-tractable language.


automated error-analysis; multiword expressions; BLEU; automated metrics; concordance; concordance-based evaluation of Machine Translation; MT-tractability

Full Text: