Evaluating RBMT output for -ing forms: A study of four target languages

Authors

  • Nora Aranberri-Monasterio, Dublin City University
  • Sharon O'Brien, Dublin City University

DOI:

https://doi.org/10.52034/lanstts.v8i.247

Keywords:

Machine Translation, -ing words, controlled language, post-editing source text, automatic evaluation metrics, Machine Translation evaluation correlations, IT domain, commercial machine translation, RBMT

Abstract

-ing forms in English are reported to be problematic for Machine Translation and are often the focus of rules in Controlled Language rule sets. We investigated how problematic -ing forms are for an RBMT system translating into four target languages in the IT domain. Constituent-based human evaluation was used, and the results showed that, in general, -ing forms do not deserve their bad reputation. A comparison with the results of five automated MT evaluation metrics showed promising correlations. Some issues persist, however, and these vary from target language to target language. We propose different strategies for dealing with these problems, such as Controlled Language rules, semi-automatic post-editing, source-text tagging and “post-editing” the source text.

Published

25-10-2021

How to Cite

Aranberri-Monasterio, N., & O'Brien, S. (2021). Evaluating RBMT output for -ing forms: A study of four target languages. Linguistica Antverpiensia, New Series – Themes in Translation Studies, 8. https://doi.org/10.52034/lanstts.v8i.247