New efforts? A competence-oriented task analysis of interlingual live subtitling

Franz Pöchhacker
University of Vienna, Austria


Aline Remael
University of Antwerp, Belgium


This article offers a theoretical analysis of interlingual live subtitling as a translational task with the aim of defining the skill set and competence profile to be developed by future practitioners. The outcome of this task analysis will thus serve to guide both curriculum design for training in interlingual live subtitling and the development of task-specific teaching methods. While taking account of the findings of empirical studies conducted in the context of the EU-funded Erasmus+ project Interlingual Live Subtitling for Access (ILSA)[1] and related research, our line of argument is essentially deductive: models of the interpreting process and of translational competence are our essential points of departure. Our process analysis of interlingual live subtitling, which refers to the Effort Model, will identify subprocesses and subskills. The latter will be put together in a competence model as a framework that can be filled with specific learning outcomes for training in interlingual live subtitling.

Keywords: live subtitling, intralingual, interlingual, translation, simultaneous interpreting, competence, Effort Model, process analysis

1. Introduction

The social transformations often subsumed under the heading of globalization, together with the development of ever-advancing technological capabilities, have given rise to new needs for communication within and across sociocultural contexts. As societies have come to acknowledge various forms of social diversity and made efforts to cater to the special needs associated with them, new methods for overcoming barriers to communication have been developed. This is particularly evident in the area of audiovisual mass media, where long-established methods for making broadcast media content accessible across language barriers – such as subtitling, dubbing, voice-over or simultaneous interpreting (SI) – have been complemented by special services to ensure sensory accessibility such as audio description (AD) for persons with visual impairments or subtitling for the deaf and hard of hearing (SDH). While some of these media accessibility services, such as AD, are new, others have been adapted from existing techniques and offered in new ways. The use of subtitles is a case in point. Rather than being used to render foreign-language audio content as written target-language text, subtitles are produced intralingually to make speech (and other audible signals) available in the written modality of the same language. Moreover, the use of prepared (pre-recorded) subtitles is complemented by subtitles produced in real time so as to make live broadcasts accessible. Such intralingual live subtitling of TV broadcasts has now become standard practice for many broadcasters, often in response to legal requirements. The interlingual form of this task, in contrast, is still very new, and has yet to be widely adopted. It is therefore the purpose of this article to engage in a more thorough analysis of this novel task, referred to here as interlingual live subtitling (ILS), and identify the competence(s) required for its successful performance. The resulting competence profile should serve as one of the starting points for designing a training programme for professional ILS.

Drawing up the competence profile of future ILS professionals presupposes a thorough understanding of the task with regard to both its process and skill components and the external demands and constraints on its actual performance in a given institutional and social setting. Whereas the broader issues of practical application will be dealt with in a subsequent phase of the ILSA project, of which this work forms a part, our focus here is on the process-based identification of competence requirements as a prerequisite for curriculum design.

As indicated above, ILS is a new type of task. It essentially consists of the real-time rendering of a spoken source-language (SL) utterance into a written target-language (TL) text. Most typically, this is used in live TV broadcasts to make commentary or other audio content in another language available to viewers in the form of on-screen subtitles. In a broader sense, and also as envisaged in the ILSA project, such real-time speech-to-text translation can also be performed at live events, with written text appearing in block mode or as scrolling subtitles or surtitles with PowerPoint, or as scrolling running text on a separate screen. Although often referred to as speech-to-text interpreting (Stinson, 2015), this type of service has so far been offered mainly in intralingual mode.

Clearly, the novel task of ILS shares some common ground with other communication-enabling services such as (interlingual) subtitling and interpreting, and in particular with (intralingual) live subtitling as a major form of ensuring media accessibility. Aside from their similar designations, ILS and intralingual live subtitling share the technique by which real-time text production is achieved. This is generally referred to as “respeaking” and consists in listening to the original and repeating (and often rephrasing and condensing) it to a speech recognition system, which turns the recognized utterances into written text (Remael, Van Waes, & Leijten, 2014; Romero-Fresco, 2011). The fact that this rephrasing, with punctuation, is done interlingually in ILS raises some terminological issues that are discussed further below. At this point, we simply highlight the hybrid nature of ILS as a task, which strongly suggests that our effort to identify the competence requirements for successful task completion should be informed by insights from such related fields as (Audiovisual) Translation, Media Accessibility and Interpreting Studies. This is illustrated in Figure 1, which shows ILS at the interface of different fields or disciplines with whose tasks it shares important features: like prepared subtitling in audiovisual translation (AVT) and live (intralingual) subtitling in media accessibility, ILS is a speech-to-text process; like standard prepared subtitling and interpreting, ILS is interlingual; and like interpreting and live subtitling, ILS is performed in real time (for a comparable illustration that focuses on skill components, see Dawson, 2018).

Against this background, we will pursue our goal of drawing up a competence model for ILS by engaging in a more thorough analysis of the task according to its purpose, processing steps and techniques before identifying the set of cognitive resources involved in and required for its performance. First, though, we attempt to raise some conceptual and terminological issues surrounding the notion of ILS (section 2). Section 3 is then devoted to our process model of the task, followed by our sketch of a competence profile for ILS in section 4. In the subsequent discussion section, our largely deductive modelling efforts are related to available empirical findings so as to move towards an evidence-based conceptual model of ILS and highlight areas in need of further research.

Figure 1 ILS at the interface of tasks and disciplines.

2. Concepts and terms

A basic terminological qualification is in order here regarding some of the labels used for illustration in Figure 1, such as AVT, accessibility and live subtitling. Rather than stable, clearly defined concepts, these notions have been undergoing change, convergence and diversification. The field of Media Accessibility, for instance, has increasingly converged with accessibility in the broader sense as well as with AVT (Remael, Orero, Black, & Jankowska, 2019), and subtitling in AVT is also done intralingually. By the same token, the boundaries of interpreting are being extended beyond the spoken and signed modalities so as also to include written target texts (Pöchhacker, 2019). Thus, as suggested by Dam and Zethsen (2019), our understanding of these concepts often relies on prototypes, which are of course subject to changing social uses and professional practices.

A prototypical understanding of live subtitling also underpins the ILSA project. This concerns foregrounding ILS in TV broadcasts, even though live events will also be covered in the project, and focusing on live subtitling using speech recognition (SR) technology (“respeaking”). While the former is reflected in the way the task is labelled, “subtitling” may be too closely associated with audiovisual media to do justice to ILS for live events. Strictly speaking, the term speech-to-text interpreting (STTI), which refers to an essentially intralingual communication service for deaf and hard-of-hearing persons (Stinson, 2015), would be a more appropriate hyperonym, and our process model is indeed conceived of in these terms. This also prompts the question of how to label the end-product of the process, which is some form of written text. The term “subtitle”, like “surtitle”, suggests a particular positioning in relation to the media screen, which may not be the case for live events using separate screens or displays on mobile devices or even glasses. We therefore suggest using the term “live-title” for the textual end-product in ILS. This ensures a clear distinction from prepared subtitles while retaining the word form[2] traditionally used in media settings, also to refer to those carrying out the task: “live-titlers” could be working either intra- or interlingually, and their output could be displayed in various different media and positions.

In a similar vein, the interlingual nature of ILS – which, as a form of “live translation”, is akin to interpreting as defined by Kade (1968) – seems inadequately captured by the term “respeaking”, which is closely associated, both semantically and in professional practice, with an intralingual language-processing activity. In order to reflect the translational nature of ILS, we suggest adopting the term “transpeaking”, which was introduced in this context for pragmatic purposes by Patricia Martínez Zapico in 2011.

With these conceptual clarifications and terminological proposals, we now proceed to analyse ILS as a process and describe the structure and the skill components of the task.

3. Process analysis

While ILS, as conceived in the present article and in the ILSA project as a whole, may appear to be a novelty, it can in fact be traced to earlier experiments and practices. One is an experiment in live interlingual subtitling on Austrian television in the late 1980s. As reported by Kurz and Katschinka (1988), this involved two English-speaking participants in an arts programme whose contributions were made accessible to German viewers in the form of subtitles. These were produced by a team comprising simultaneous interpreters and a media professional (subtitler). The interpreter would produce a compressed German rendering of the SL utterance and her output would then be typed and “spotted” live on the programme by the subtitler. The problem of the extended time lag resulting from this two-step procedure is obvious and it may explain why the experiment had no follow-up – unlike another forerunner project in the Netherlands. The Dutch public broadcaster (Nederlandse Omroep Stichting, NOS) first trialled live interlingual subtitling in the late 1990s, using specially trained Velotype subtitlers (den Boer, 2001). Subsequent efforts involved a broadcast delay of 20 to 30 seconds and produced satisfactory performance, notwithstanding some loss of content. Rather than simultaneous interpreters, NOS relied on teams of two professional subtitlers (translators), one of whom would “interpret” and the other type, the pair handing over to another team after some 10 to 15 minutes (de Korte, 2006).

De Korte (2006) explicitly mentions the option of using SR technology as a way of enhancing the live subtitling method. While this has increasingly become common practice for (public) broadcasters in the Netherlands and in Flanders, de Korte’s early account anticipates a key issue to be addressed in the ILSA project, namely, the skill set most suitable for the task. The fact that NOS decided against using interpreters at the time because they “did not want every single word translated” (de Korte, 2006), whereas Kurz and Katschinka (1988) speak of a “compressed” interpretation, points to a high degree of convergence between the relevant skills of subtitlers and interpreters. Hence the need for a more detailed analysis of the process.

3.1 Task description

As a communication-enabling service in response to specific social needs, ILS must be described, first and foremost, in terms of its purpose, users and contextual constraints. However, the hybrid nature of the task makes this rather difficult: viewed as an extension of AVT and interpreting, ILS targets viewers with an insufficient understanding of the SL used in the broadcast; as an interlingual variant of SDH, on the other hand, ILS makes media users with sensory impairments the prime target group(s). In addition to this variation even within the TV setting, the use of ILS, or STTI, in live events, including educational settings, brings in yet another set of user groups with specific communicative needs and contextual constraints. These different scenarios of application will be described more fully as part of the ILSA project; for present purposes, we therefore concentrate on the process and competences for ILS aimed at TV audiences.

3.2 Process model

As in the early experiments mentioned above, ILS as a task cannot be accomplished in a single step. Rather, it is a multi-step process involving a primary phase in which SL audio content is rendered in the TL by a transpeaker, followed by a secondary phase in which the transpeaker’s output is turned into written text by an SR system. In the prototypical TV set-up the transpeaker, like the respeaker, listens to the audio input through a headset and uses a microphone to rephrase this input to a computer with respeaking and subtitling software, which together turn the spoken input into written subtitles. These draft subtitles can then usually still be edited using the computer keyboard before they are broadcast, although this is not always the case. In live settings the subtitling software is often replaced by captioning software such as Text on Top® that projects the written output of the respeaking process onto a screen in the conference room. In this scenario, post-respeaking editing will often be visible to the viewers.

In a third phase of the transpeaking process, the SR output is monitored and, if necessary, corrected by manual keyboard input just before it is made available to target viewers. This editing phase may be in the hands of a second person; we will refer to this as the Duo TS model (analogous to the Duo LS model in Remael et al., 2014), but it can also be performed by the transpeaker in a Mono TS model.[3] Thus, the three-step process of ILS can be accomplished by a single individual working in tandem with an SR system. This basic process is illustrated in Figure 2.

Figure 2 Basic ILS (speech-to-text) process.

As indicated by the triangular shape on the right, the entire process is driven by the goal of giving a particular target audience access to what is spoken (and heard) in the (audiovisual) source text (ST). Textual components are found at either end of this goal-oriented process as well as in between: the former are the (spoken) ST and the written target text (TT); the latter, intermediary texts are the transpeaker’s output and the written output of the SR system, which is depicted as a cross-cutting stretch that feeds into the TT, either with or without prior editing. Aside from the final editing phase, the interface between the transpeaker’s output and the recognized text is a crucial point in the process: it is evidently shaped by the human agent as much as by the capabilities of the software. The initial transpeaking phase, in contrast, depends solely on the transpeaker, but is nonetheless marked by a high degree of complexity.
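For readers who find a data-flow notation helpful, the three phases and the Mono TS / Duo TS distinction can be sketched roughly as follows. This is only an illustrative sketch, not a description of any actual respeaking or subtitling software: all names (LiveTitlingSetup, transpeak, recognize, edit) are hypothetical labels for the phases described above, and the stand-in functions in the usage lines merely manipulate strings.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical type labels for the three phases described in the text.
Transpeak = Callable[[str], str]  # phase 1: SL audio content -> spoken TL rendering
Recognize = Callable[[str], str]  # phase 2: spoken TL rendering -> draft written text (SR)
Edit = Callable[[str], str]       # phase 3: draft written text -> corrected live-title


@dataclass
class LiveTitlingSetup:
    transpeak: Transpeak
    recognize: Recognize
    edit: Optional[Edit] = None         # None: SR output is released without prior editing
    editor_is_transpeaker: bool = True  # True: Mono TS model; False: Duo TS model

    @property
    def model(self) -> str:
        return "Mono TS" if self.editor_is_transpeaker else "Duo TS"

    def run(self, source_segment: str) -> str:
        """Pass one source segment through the phases in order."""
        spoken_tl = self.transpeak(source_segment)
        draft = self.recognize(spoken_tl)
        return self.edit(draft) if self.edit else draft


# Usage with toy stand-ins (assumptions for illustration only).
setup = LiveTitlingSetup(
    transpeak=lambda sl: f"TL({sl})",
    recognize=lambda spoken: spoken.lower(),       # pretend SR loses capitalization
    edit=lambda draft: draft.replace("tl", "TL"),  # pretend the editor repairs it
    editor_is_transpeaker=False,
)
print(setup.model, "->", setup.run("Good evening"))
```

The sketch makes one point of the model explicit: the written target text is always a function of the SR output, with editing as an optional final step that may or may not be assigned to the same person.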

Whereas the ILS process as a whole is obviously different from the process of SI, the transpeaking phase is essentially an interpreting task, albeit with specific requirements. In line with Kade’s (1968) definition, the SL message is available only once, and there is very little opportunity to correct or revise the (spoken) TL text if the rendering is truly live.[4] However, the transpeaker’s output must be geared not to human listeners but to the capabilities and settings of the software in such a way as to ensure written text output that can be read and understood at a glance. This alters the production part of the transpeaking process, over and above the strategic processing requirements arising from the time pressure at either end of the ILS process: whereas the transpeaker will probably need to cope with a high audio input rate, the speed of TT presentation is constrained by the processing capacity of the software and by the target audience’s reading-time needs, and the TT’s physical form depends on the space available. In many circumstances, this will result in the need for strategic compression, as described also for conventional SI.
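The mismatch between input rate and presentation rate can be made concrete with a simple back-of-the-envelope calculation. The sketch below is an assumption-laden illustration: the function name and the two rates are hypothetical values chosen for the example, not figures reported in the studies cited here, and the calculation ignores the fact that word counts differ between source and target languages.

```python
def required_compression(input_wpm: float, display_wpm: float) -> float:
    """Upper bound on the share of the source word count that can be displayed.

    input_wpm:   speech rate of the source audio, in words per minute (assumed)
    display_wpm: rate at which the audience can read the live-titles (assumed)
    """
    if input_wpm <= 0 or display_wpm <= 0:
        raise ValueError("rates must be positive")
    # No compression is needed when the audience can read as fast as the speaker talks.
    return min(1.0, display_wpm / input_wpm)


# Hypothetical example values: a fast speaker and a slower comfortable reading rate.
ratio = required_compression(input_wpm=180, display_wpm=150)
print(f"retain at most {ratio:.0%} of the source word count")
```

Under these assumed rates, roughly one word in six would have to be dropped or condensed, which is the kind of pressure that makes strategic reformulation a core part of the task.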

Despite these special features, the shared ground between transpeaking and SI should amply warrant an approach inspired by Gile’s Effort Models of interpreting (Gile, 2015), which are alluded to in the title of this article. Clearly, the basic components of the Effort Model for SI must also be accounted for in a process model of transpeaking, but we make one exception: Gile’s “Memory Effort” refers to the need for short-term storage, which is generally considered to be a function of working memory. But current conceptions of working memory also include information processing and retrieval and executive control functions (Timarová, 2015), and relate the construct of working memory to attentional resources interacting with (long-term) memory. Accordingly, we would assume that the entire transpeaking process (depicted in the shape of a right trapezoid) draws on available attentional (or working memory) resources along the lines of Gile’s original notion of mental “energy” or processing capacity.

Of the three remaining “efforts” in Gile’s model, we would adopt two with only a slight change in the labelling: listening comprehension (Gile’s “L”) and coordination and control (“C”). Gile’s “P” Effort, on the other hand, is very holistic and involves all the components of the production process that are distinguished, for example, on the output side of Setton’s (1999) process model of SI. Given the special demands on the transpeaker’s output, which serves as input to the second, software-based phase of the process, we prefer to make the output process more explicit by distinguishing three components of production:

(1) “strategic reformulation”, conceived as a cognitive subprocess on a par with listening comprehension that corresponds to the “Formulator” and “Parser” components in Setton’s (1999) process model of SI;

(2) “dictation”, understood as a specific software-adapted style of articulation, which also includes the verbalization of punctuation, speaker change and relevant auditory information, which is only occasionally still added through typing; and

(3) (auditory) “monitoring”, which is particularly consequential at the interface with the automatic recognition process.

Beyond the transpeaking process, monitoring – of visual text – is also an important component process in the editing phase, where it may lead to keyboard-based intervention to correct the SR output and give it its final form. When such monitoring and correction is carried out by the transpeaker, there is an additional “coordination and control” (C&C) component that spans the (auditory–oral) transpeaking and (visual–manual) editing phases of the process. Thus, in addition to the “effort” of coordinating – vertically, as it were – the simultaneous subprocesses of the transpeaking phase, ILS requires an additional “horizontal” C&C effort that arises from the added task requirement of real-time editing.

Considering the cognitive complexity of ILS, it seems justified to devote special attention to the core process of the task. Nevertheless, and especially with a view to a comprehensive model of competence requirements, the cognitive micro-process between ST input and TT output must be complemented by a broader view of ILS as a professional course of action. Here again, reference can be made to existing (macro-)process models of interpreting, such as the four-fold distinction by Kalina (2000, p. 126) between pre-process requirements, peri-process conditions, in-process requirements and post-process efforts. Kalina’s (2000) scheme, which was developed to account for factors in the process of quality assurance, was subsequently adapted by Albl-Mikasa (2013, p. 19) to take into account the competence requirements for professional conference interpreters as elicited in an interview-based study. This scheme will stand us in good stead in developing the ILS competence profile in section 4.

Both Kalina (2000) and Albl-Mikasa (2013) include “preparation” as a main requirement in the pre-process phase. Whereas any prior learning and training relevant to task performance could conceivably come under this heading, ILS with SR in this respect includes the fundamental need to ensure optimum interaction with the SR system. This ranges from hardware settings and account creation to a fully developed SR profile and a list of macros and house styles. More specifically, with regard to a given assignment, preparation includes both thematic research into the topic and programme type (including the target audience) and linguistic and terminological research. In addition, the result of thematic and terminological research must literally feed into the SR system by way of document uploads and additions to the terminological database.

In the peri-process phase, special attention is given to teamwork and cooperation, and this applies to the teamwork of live-titlers no less than to simultaneous interpreters in a booth. Post-process tasks, finally, include debriefing with team members to identify issues to be resolved; accuracy assessment in the broader context of quality management; and remedial work to prevent errors and weaknesses from recurring in future assignments – for instance, by adding terminology to the SR database or by further training in the use of the SR software.

This brief account of the ILS process and its subcomponents, both in-process and pre-, peri- and post-process, is summarized in Figure 3, and will serve as our point of departure for identifying competences and skills in the following section.

Figure 3 Process model of ILS.

4. Competence profile

4.1 Definitions and models

Although “competences” are a staple of the didactic literature on translation and interpreting, a clear-cut definition of the term remains elusive. On the one hand, it is an umbrella concept, defined, for instance, in the context of the European Qualifications Framework, EQF (2008) as “the proven ability to use knowledge, skills and personal, social and/or methodological abilities, in work or study situations and in professional and personal development” (p. 11). On the other hand, Albl-Mikasa (2013), on the subject of interpreting, writes that competence is “a general term for everything an interpreter needs to know and be able to do to perform a professional task” (p. 19), which she then goes on to discuss in greater detail in terms of “skills”.

For translation more generally, Robert, Remael and Ureel (2017) point out that translation competence was originally considered to be mostly a linguistic competence, but that it is now “generally recognized and conceptualized as a complex construct consisting of different sub-competences, that is, as a multicomponential competence” (p. 1). The above description of the transpeaking process demonstrates that transpeaking, too, is a construct that comprises multiple components. However, the question is this: Which sub-competences are translation and interpreting composed of or, in our case, which sub-competences is transpeaking composed of? In Translation Studies the idea of a multicomponential translation competence has given rise to a growing body of translation competence models (Göpferich, 2009; Hurtado Albir, 2017) as well as the EMT Competence Framework (EMT, 2017) (see Robert et al., 2017 for an overview). However, similarly comprehensive models for (simultaneous) interpreting seem not to exist. Albl-Mikasa’s (2012) “process- and experience-based model of interpreter competence” (p. 63) lists the major “skills” that interpreters themselves perceive as essential for their trade, and echoes the SI skills and competences mentioned in other publications (see Grbić & Pöchhacker, 2015). We therefore propose our own multicomponent model for transpeaking, but one that finds its inspiration in the literature mentioned above.

In order to arrive at a workable definition of competence for our transpeaking context, we draw on Robert et al. (2017) and use “competence” as a very broad notion that can refer to cognitive resources of three different kinds: declarative knowledge (knowing what), procedural knowledge or skills (knowing how),[5] and socio-psychological resources, such as having the willingness and ability to work in a team. The set of competences posited below may therefore involve the different types of knowledge and ability which live-titlers will draw on at different stages of the process.

4.2 Competence model

The model outlined below and illustrated in Figure 4 shares a number of components with existing competence models for translational activity, but it also features (sub-)competences that we consider unique to the task of ILS. Not surprisingly, special competence in the realms of language and culture is considered fundamental, as is a rich repertoire of knowledge that includes both general (“world”) knowledge and domain-specific knowledge of the subject-matter at hand. Other types of competence found in most models include socio-psychological competences, including those relating to interpersonal relationships, and the knowledge and skills required for service provision in a professional context, not least in the case of freelance work. The more unique components of our competence model, on the other hand, are closely linked to the nature of the task, and will be characterized as technical and methodological.

Figure 4 ILS process and competence model.

The five main components of the model – specified as linguistic and cultural competence, world knowledge and subject-matter competence, technical–methodological competence, (inter)personal competence and professional competence – are explained in more detail below and are related to particular component processes of the task. At a subsequent stage, the ILS competence profile will serve to inform the design of a curriculum, which is the principal aim of the ILSA project. For this purpose, the various competences and skills will have to be translated into learning outcomes, defined by Kennedy, Hyland and Ryan (2009) as “statements of what a learner is expected to know, understand and/or be able to demonstrate after completion of a process of learning” (p. 5). This will entail operationalizing the competences as actions that can be taught through concrete learning activities and assessed at different levels.

In our analysis of the cognitive resources that interlingual live-titlers must draw on, and starting from the process model represented in Figures 2 and 3, we first introduce the multicomponential competence that distinguishes transpeaking from related practices and forms the unique core competence of our model: technical–methodological competence. The cognitive resources it includes also have an impact on the particular form that other competences take and play a determining role in most, if not all, of the stages of the transpeaking process. Specifically, we distinguish six sub-competences, some of which focus on (declarative) knowledge, whereas most involve operational competences besides some socio-psychological ones.

The first sub-competence is mainly declarative knowledge of the transpeaking task and process. Transpeakers must have a full understanding of the entire speech-to-text process and its function within a specific communicative setting (whether broadcasting or live settings) and also of the process-related requirements that arise from these different contexts.

The second sub-competence relates to research and preparation. It consists of the knowledge and skills that are needed in the pre-process stage when transpeakers prepare an assignment by extending and activating their knowledge base and by fine-tuning the database in the SR software.

The third sub-competence, simply labelled translation, is at the core of the technical–methodological competence and encompasses all the stages of the transpeaking process. It combines traits similar to those found in SI, intralingual live subtitling and interlingual subtitling, with specific additional features. The listening comprehension skills are similar but not identical to those required for interpreting and paraphrasing. Transpeakers also work from auditory source input that is available only once, but TV audio can be very different and can include overlapping dialogue, for instance. In any case, strategic reformulation requires the ability to distinguish quickly between what is essential and what is not (e.g., when aural input accelerates or varies in speed) and even to improvise, as in the case of poor audio quality or other technological glitches. Uniquely, the reformulated output must be articulated for a non-human recipient in a way that takes the particularities of SR and of the subtitling software into account, thus ensuring optimum recognition and appropriate formatting of the written text. Whereas the translation sub-competence draws on linguistic competence to ensure correct spelling, grammar and punctuation, it also includes the motor skill of using the keyboard to input information that cannot easily be verbalized, such as using colours to identify different speakers.

The fourth sub-competence, multi-tasking, is of an eminently procedural nature. It is needed to ensure that an individual can cope with multiple tasks and manage processing efforts in such a way as to avoid overload or even breakdown. Multi-tasking in transpeaking is unique in that it involves the simultaneous activation of various cognitive and/or psychomotor processes throughout the three main stages of the transpeaking process. As in SI, it requires listening comprehension and strategic reformulation to be juggled with self-monitoring one’s audio output; in addition, and similarly to real-time intralingual subtitling, it involves concurrent psychomotor skills (eye-ear-hand coordination) for occasional typing to edit the written output. Beyond these cognitive processing tasks, multi-tasking may also extend to coordinating teamwork when a task has to be accomplished by two professionals working in tandem.

Sub-competence number five is audiovisual monitoring. This is required in order to scan audiovisual texts for visual information beyond the verbal and the non-verbal auditory input and to monitor one’s own or a teammate’s written output. Such output may have to interact with other visual information on the screen and non-verbal information in the audio. As mentioned above, technical–methodological competence also includes a sixth sub-competence, editing. The knowledge and skills required may differ considerably in quantity and quality, depending on the setting in which the transpeaker is operating and the type of software that is used.

In addition to the core competence of ILS with its six sub-competences, the model features four competences that more closely resemble those associated with the related tasks and domains visually represented in Figure 1. Nevertheless, the nature of these more generic competences will, of course, be shaped by the different components of the technical–methodological competence as described above.

As in most models relating to mediated communication, linguistic and cultural competence, which is of both a declarative and a procedural nature, must be posited as a basic requirement. In ILS, linguistic proficiency is essential for ST comprehension but it plays an even greater role in the production and editing stages in the TL, which require the correct use of word forms, grammar, spelling and punctuation and due regard for register and textual cohesion. Working with and between two languages also requires familiarity with the respective sociocultural systems of reference. In transpeaking, cultural competence must be particularly rich for receptive processing, given the highly varied speaking styles and cultural backgrounds encountered in programmes to be subtitled. At the same time, this competence must also be sufficiently developed in order for transpeakers to understand the more or less specific needs of potential communities of TL users, such as individuals with a hearing impairment.

No less indispensable are world knowledge and subject-matter competence, where the latter can be regarded as a subset of the former. Since there are limits to what can be prepared in the pre-process stage and fed into the terminology database of the SR system, transpeakers, like interpreters, must be able to draw on a vast store of general and specialized knowledge and mobilize these resources with great efficiency. As highlighted by the activation of concepts and terms, world knowledge overlaps considerably with linguistic competence and, where culturally specific concepts and expressions are concerned, with cultural competence. The same applies to the need for alternative contextually appropriate solutions when transpeakers do not expect the SR system to recognize certain terms.

In ILS, socio-psychological traits and requirements, which we refer to as (inter)personal competence, take a number of different forms. As in the performance of other highly demanding tasks, the ability to manage stress and the motivation to perform well constitute important demands on the individual. At the same time, interpersonal skills are particularly important for teamwork in duo-transpeaking, where teams alternate, but are also needed in collaborative efforts to prepare the SR software in the pre-process stage.

Finally, professional competence is considered a core requirement for translation service provision in any setting. However, it is difficult to say at this stage which sub-competences should be foregrounded for ILS, since the profession is only just emerging and the type of service provision may vary depending on the setting. Aspects of professional competence may range from compliance with an employer’s relevant guidelines and procedures to networking and marketing skills for freelancers and to continuing professional development, not least regarding accessibility and digital technologies, in order to optimize one’s role in a complex overarching workflow – for instance, in a broadcasting company. Thus, as in the case of the other competences listed above, the specific set of professional competence requirements is interrelated with both the sub-competences of the technical–methodological core competence and the organizational and institutional frameworks in which task performance is embedded.

5. Discussion and conclusion

In line with the aim of the ILSA project to develop a professional profile for ILS, we have sketched a first competence model for this novel task by undertaking a descriptive analysis of the process and identifying the competences required for successful performance. The relative importance of the competence areas in our model, from linguistic and cultural to personal and professional skills, is still difficult to determine, as it depends on the particular form the task may take in a given situational and professional context and communicative setting. There is little doubt, however, about the crucial role of the technical–methodological competence that we consider unique to the task. Its six sub-competences, in turn, are highly diverse and interrelate in a roughly cascading fashion, from global task understanding to editing skills, with multi-tasking and translation as key sub-competences.

Although our competence model is largely hypothetical, it is in line with initial empirical findings as presented by Robert, Schrijver and Diels in this volume: the “prerequisites” for successful ILS that were elicited in their questionnaire-based survey among professionals, trainers and service providers can all be subsumed under one or more of the competences we have distinguished. Admittedly, the definition and interrelation of the various competences and sub-competences must remain open to discussion, and much further research will be required in order to understand how they inform the various stages and components of the transpeaking process and the ILS task as a whole. Still, we hope to have succeeded in providing a conceptual foundation and a theoretical underpinning for the development of a training curriculum, for which the competences will be reformulated as concrete learning outcomes, and task performance will be contextualized with reference to specific communicative scenarios, from TV broadcasts to live-event settings.


References

Albl-Mikasa, M. (2012). The importance of being not too earnest: A process- and experience-based model of interpreter competence. In B. Ahrens, M. Albl-Mikasa, & C. Sasse (Eds.), Dolmetschqualität in Praxis, Lehre und Forschung: Festschrift für Sylvia Kalina (pp. 59‒92). Tübingen: Gunter Narr.

Albl-Mikasa, M. (2013). Developing and cultivating expert interpreter competence. The Interpreters’ Newsletter, 18, 17‒34.

Dam, H. V., & Zethsen, K. K. (2019). Professionals’ views on the concepts of their trade: What is (not) translation? In H. V. Dam, M. N. Brøgger, & K. K. Zethsen (Eds.), Moving boundaries in translation studies (pp. 200‒219). London: Routledge. doi:10.4324/9781315121871-13

Dawson, H. (2018, October). Identifying the task-specific skills required for interlingual respeaking: An empirical approach. Paper presented at Languages & the Media 2018: 12th International Conference on Language Transfer in Audiovisual Media, Berlin, Germany.

de Korte, T. (2006). Live inter-lingual subtitling in the Netherlands: Historical background and current practice. inTRAlinea Special issue: Respeaking. Retrieved from​in_the_​Nether​lands

den Boer, C. M. (2001). Live interlingual subtitling. In Y. Gambier & H. Gottlieb (Eds.), (Multi)media translation: Concepts, practices, and research (pp. 167‒172). Amsterdam: John Benjamins. doi:10.1075/btl.34.20boe

EMT. (2017). European Master’s in Translation: Competence framework 2017. Retrieved from

EQF. (2008). The European Qualifications Framework for Lifelong Learning (EQF). Retrieved from

Gile, D. (2015). Effort models. In F. Pöchhacker (Ed.), Routledge encyclopedia of interpreting studies (pp. 135‒137). London: Routledge.

Göpferich, S. (2009). Towards a model of translation competence and its acquisition: The longitudinal study ‘TransComp’. In S. Göpferich, A. L. Jakobsen, & I. M. Mees (Eds.), Behind the mind: Methods, models and results in translation process research (pp. 11–37). Copenhagen: Samfundslitteratur.

Grbić, N., & Pöchhacker, F. (2015). Competence. In F. Pöchhacker (Ed.), Routledge encyclopedia of interpreting studies (pp. 69‒70). London: Routledge. doi:10.4324/9781315678467

Hurtado Albir, A. (Ed.). (2017). Researching translation competence by PACTE Group. Amsterdam: John Benjamins. doi:10.1075/btl.127

Kade, O. (1968). Zufall und Gesetzmäßigkeit in der Übersetzung. Leipzig: Enzyklopädie.

Kalina, S. (2000). Interpreting competences as a basis and a goal for teaching. The Interpreters’ Newsletter, 10, 3‒32.

Kennedy, D., Hyland, A., & Ryan, N. (2009). Learning outcomes and competences. In E. Froment, J. Kohler, L. Purser, & L. Wilson (Eds.), EUA Bologna handbook: Making Bologna work (B 2.3‒3). Berlin: Raabe.

Kurz, I., & Katschinka, L. (1988). Live subtitling: A first experiment on Austrian TV. In P. Nekeman (Ed.), Translation, our future: Proceedings of the XIth World Congress of FIT (pp. 479‒483). Maastricht: Euroterm.

Merriam-Webster. (2018). Title. Retrieved from​diction​ary/​title

Pöchhacker, F. (2019). Moving boundaries in interpreting. In H. V. Dam, M. N. Brøgger, & K. K. Zethsen (Eds.), Moving boundaries in translation studies (pp. 45‒63). London: Routledge.

Remael, A., Orero, P., Black, S., & Jankowska, A. (2019). From translators to accessibility managers: How did we get there and how do we train them? MonTI, 11, 131–154. doi:10.6035/MonTI.2019.11.5

Remael, A., Van Waes, L., & Leijten, M. (2014). Live subtitling with speech recognition: How to pinpoint the challenges. In D. Abend-David (Ed.), Media and translation: An interdisciplinary approach (pp. 121‒147). London: Bloomsbury Academic.

Robert, I. S., & Remael, A. (2017). Assessing quality in live interlingual subtitling: A new challenge. Linguistica Antverpiensia New Series ‒ Themes in Translation Studies, 16, 168‒195.

Robert, I. S., Remael, A., & Ureel, J. J. J. (2017). Towards a model of translation revision competence. The Interpreter and Translator Trainer, 11(1), 1‒19. doi:10.1080/1750399X.2016.1198183

Romero-Fresco, P. (2011). Subtitling through speech recognition: Respeaking. Manchester: St. Jerome.

Setton, R. (1999). Simultaneous interpretation: A cognitive-pragmatic analysis. Amsterdam: John Benjamins. doi:10.1075/btl.28

Stinson, M. S. (2015). Speech-to-text interpreting. In F. Pöchhacker (Ed.), Routledge encyclopedia of interpreting studies (pp. 399‒400). London: Routledge.

Timarová, S. (2015). Working memory. In F. Pöchhacker (Ed.), Routledge encyclopedia of interpreting studies (pp. 443‒446). London: Routledge.

[1] This research was carried out within the framework of the European ILSA Project (for further information, see the project website), though it was not part of the funded Intellectual Outputs of the project. We would like to thank our ILSA partners for their critical reading of the draft version of this article and their constructive input.

[2] Merriam-Webster’s Dictionary (2018) defines one sense of the word “title” as “written material introduced into a motion picture or television program to give credits, explain an action, or represent dialogue”.

[3] Whether one or two professionals are deployed may depend on the degree of difficulty of the programme. In rare cases, set-ups with teams of up to four persons occur, which will no doubt turn out to be commercially unviable (Robert & Remael, 2017).

[4] In semi-live subtitling, which can take different forms, SR can be corrected in a post-editing process, with specific skill requirements of its own.

[5] It is important to note that skills (procedural knowledge) also include the ability to execute a given task.