A Critique of Statistical Machine Translation

Andy Way

doi:10.52034/lanstts.v8i.243

Authors

Andy Way, Dublin City University

DOI:

https://doi.org/10.52034/lanstts.v8i.243

Keywords:

Statistical Machine Translation, Phrase-Based Statistical Machine Translation, Corpus-based Machine Translation, Rule-Based Machine Translation, Example-Based Machine Translation, Machine Translation Evaluation, Syntax, Machine Translation

Abstract

Phrase-Based Statistical Machine Translation (PB-SMT) is clearly the leading paradigm in the field today. Nevertheless, and this may come as some surprise to the PB-SMT community, most translators and, somewhat more surprisingly perhaps, many experienced MT protagonists find the basic model extremely difficult to understand. The main aim of this paper, therefore, is to discuss why this might be the case. Our basic thesis is that proponents of PB-SMT do not seek to address any community other than their own, for they do not feel any need to do so. We demonstrate that this was not always the case; on the contrary, when statistical models of translation were first presented, the language used to describe how such a model might work was very conciliatory and inclusive. Over the next five years, things changed considerably; once SMT achieved dominance, particularly over the rule-based paradigm, it had established a position where it did not need to bring along the rest of the MT community with it, and in our view, this has largely pertained to this day. Having discussed these issues, we discuss three additional issues: the role of automatic MT evaluation metrics when describing PB-SMT systems; the recent syntactic embellishments of PB-SMT, noting especially that most of these contributions have come from researchers who have prior experience in fields other than statistical models of translation; and the relationship between PB-SMT and other models of translation, suggesting that there are many gains to be had if the SMT community were to open up more to the other MT paradigms.

Published

25-10-2021

How to Cite

Way, A. (2021). A Critique of Statistical Machine Translation. Linguistica Antverpiensia, New Series – Themes in Translation Studies, 8. https://doi.org/10.52034/lanstts.v8i.243

Issue

Vol. 8 (2009): Technology evaluation

Section

Articles

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Authors who publish with this journal agree to the following terms:

Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under the CC BY-NC 4.0 Deed that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal. The material cannot be used for commercial purposes.

Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.

Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
