Automated error analysis for multiword expressions: Using BLEU-type scores for automatic discovery of potential translation errors
DOI:
https://doi.org/10.52034/lanstts.v8i.246
Keywords:
automated error-analysis, multiword expressions, BLEU, automated metrics, concordance, concordance-based evaluation of Machine Translation, MT-tractability
Abstract
We describe the results of a research project aimed at automatic detection of MT errors using state-of-the-art MT evaluation metrics, such as BLEU. Currently, these automated metrics give only a general indication of translation quality at the corpus level and cannot be used directly for identifying gaps in the coverage of MT systems. Our methodology uses automatic detection of frequent multiword expressions (MWEs) in sentence-aligned parallel corpora and computes an automated evaluation score for concordances generated for such MWEs which indicates whether a particular expression is systematically mistranslated in the corpus. The method can be applied both to source and target MWEs to indicate, respectively, whether MT can successfully deal with source expressions, or whether certain frequent target expressions can be successfully generated. The results can be useful for systematically checking the coverage of MT systems in order to speed up the development cycle of rule-based MT. This approach can also enhance current techniques for finding translation equivalents by distributional similarity and for automatically identifying features of MT-tractable language.
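The concordance-based scoring described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: frequent source n-grams stand in for MWE detection, `bleu_type_score` is a simplified single-reference BLEU limited to unigrams and bigrams, and the function names and the four-sentence French-English corpus are hypothetical, invented only for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bleu_type_score(hypotheses, references, max_n=2):
    """Simplified corpus-level BLEU (assumption: one reference per sentence)."""
    hyp_len = sum(len(h) for h in hypotheses)
    ref_len = sum(len(r) for r in references)
    if hyp_len == 0:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        matched = total = 0
        for hyp, ref in zip(hypotheses, references):
            hyp_counts = Counter(ngrams(hyp, n))
            ref_counts = Counter(ngrams(ref, n))
            # Clip each hypothesis n-gram count by its count in the reference.
            matched += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total += sum(hyp_counts.values())
        if matched == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(matched / total))
    brevity = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity * math.exp(sum(log_precisions) / max_n)


def frequent_ngrams(sentences, n=2, min_count=2):
    """Crude stand-in for MWE detection: n-grams found in >= min_count sentences."""
    counts = Counter(g for s in sentences for g in set(ngrams(s, n)))
    return [g for g, c in counts.items() if c >= min_count]


def score_source_mwes(corpus, n=2, min_count=2):
    """Score the concordance of each frequent source n-gram.

    corpus: list of (source, mt_output, reference) token lists.
    Returns a dict mapping each n-gram to the BLEU-type score of the
    MT output for the sentences whose source contains that n-gram.
    """
    sources = [src for src, _, _ in corpus]
    scores = {}
    for mwe in frequent_ngrams(sources, n, min_count):
        concordance = [(mt, ref) for src, mt, ref in corpus if mwe in ngrams(src, n)]
        scores[mwe] = bleu_type_score(
            [mt for mt, _ in concordance], [ref for _, ref in concordance]
        )
    return scores


# Invented toy data: (source, MT output, human reference).
corpus = [
    ("il a pris la parole hier".split(), "he took the word yesterday".split(), "he spoke yesterday".split()),
    ("elle a pris la parole aussi".split(), "she took the word too".split(), "she spoke too".split()),
    ("il a lu le livre hier".split(), "he read the book yesterday".split(), "he read the book yesterday".split()),
    ("elle a lu le livre aussi".split(), "she read the book too".split(), "she read the book too".split()),
]

scores = score_source_mwes(corpus)
# Expressions with the lowest concordance scores are candidates for systematic errors.
for mwe, score in sorted(scores.items(), key=lambda item: item[1]):
    print(" ".join(mwe), round(score, 3))
```

In this toy corpus the concordance for "la parole" scores 0.0 while the one for "le livre" scores 1.0, so the former would be flagged for inspection. Applying the same procedure to frequent n-grams on the reference side would correspond to the target-MWE variant mentioned in the abstract.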
Published
25-10-2021
How to Cite
Babych, B., & Hartley, A. (2021). Automated error analysis for multiword expressions: Using BLEU-type scores for automatic discovery of potential translation errors. Linguistica Antverpiensia, New Series – Themes in Translation Studies, 8. https://doi.org/10.52034/lanstts.v8i.246
Issue
Section
Articles
License
Copyright (c) 2021 Bogdan Babych, Anthony Hartley
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under the CC BY-NC 4.0 Deed that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal. The material cannot be used for commercial purposes.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).