Testing digital pen technology in a hybrid mode of interpreting

Conference interpreter training programs should explore the use of digital pen technology in the classroom and promote further research in this area.

In 2009, during a technology seminar on how new technologies could influence language education, I was introduced to a new technology, the Livescribe digital pen, whose advanced features allow users to simultaneously record what is said in the room and capture what they write in relation to it. The device was originally invented to assist secretaries in taking minutes during meetings and in retrieving their notes, and has since been trialled and used for research projects in various fields such as education, engineering, health and allied health, journalism and science (e.g. Boyle 2012; Dawson & Plummer 2010; Grieve & McGee-Lennon 2010).

Immediately inspired by its unique features, I proceeded to trial the pen in the note-taking classroom for consecutive interpreting. Most research in the field of note-taking had so far focussed only on the interpreter’s notes as a product and rarely on the process of note-taking, mainly because of the limitations of the available technology and resources. With its unique technical features allowing the simultaneous recording of both what is said in the room and what is written by the user, the pen offered for the first time the possibility of capturing live the note-taking process of an interpreter at work. I therefore developed pedagogical sequences for the note-taking classroom and have been using the device since then, at various stages of the training.

[For more information on training in note-taking using the Smartpen and feedback from trainees and educators, please refer to the following publications: Interpreting training and digital pen technology and Implementing digital pen technology in the consecutive interpreting classroom.]

Digital pen technology and Consec-simul

Building on the conclusions about the use of the technology in the classroom, I decided to test the amenability of the digital pen to a hybrid mode of interpreting, used in a consecutive interpreting context, in which the consecutive and simultaneous modes are mixed. It was the specific recording features of the digital pen that drew me to research on this hybrid mode.

Most interpreters tend to find consecutive interpreting assignments, which require the understanding, memorisation and note-taking of a speech, rather difficult and stressful. For this reason, performance-enhancing technology is a resource welcomed by interpreters, especially if it can reduce the strain on short-term memory retention (the memory in action between the moment a speech is heard and the moment notes representing it are taken). Technology-assisted interpreting has long been of interest to trainers, practitioners and students seeking ways of integrating technological applications into their everyday professional life.

In 1999, for example, Michele Ferrari, a European Commission interpreter, was the first professional interpreter to employ digital technology by recording the source speech of a commissioner, then playing it back from his digital recording device, and interpreting it simultaneously. For the first time, a consecutive interpretation was performed simultaneously.

This original approach to a “digitally remastered” consecutive interpretation and this new mode triggered a great deal of interest from researchers and, from then on, several studies were conducted. As indicated in Hamidi and Pöchhacker (2007, pp. 277-278), various practitioners have trialled different tools to test the efficiency of digital assistance when performing a long consecutive interpretation. For example, Ferrari carried out tests at the DG Interpretation with various devices in 2002 and 2003 (p. 277). These initial trials were soon followed, in 2003 and 2005 respectively, by those of John Lombardi and Erik Camayd-Freixas, two American interpreters who found the technique very useful for court interpreting assignments (Lombardi 2003, Camayd-Freixas 2005). In 2006, Hamidi completed her Master’s thesis on the subject, for which she carried out a study and collected data on the hybrid simultaneous consecutive mode, also called ‘SimConsec’. As reported by Pöchhacker (2012), other studies have been conducted since, notably by several master’s students: Sienkiewicz in 2010, Hawel in 2010, Richter in 2010, and Hiebl in 2011.

As most attempts have shown, and as expressed by Hamidi and Pöchhacker (2007), the new simultaneous consecutive mode allows an “improvement in quality” (p.278) and “is praised for its increased accuracy and completeness” (p.278). Because “note-taking is no longer necessary [which] allows the interpreter to devote more attention to listening and comprehension” (p.278) it “permitted enhanced interpreting performances” and was “considered a viable technique” (p.288), despite some caveats about poor communication with the public. Indeed, even if the abovementioned studies have found an enhanced accuracy and completeness in the interpretations in the new mode, most have also pointed out a poorer audience contact and interaction during the simultaneous part of the task.

In 2013, I decided to carry out a study to compare the interpreting performance of interpreters who used the conventional consecutive interpreting mode and this new hybridised mode with the aid of the digital pen.

Previous experiments and studies conducted to investigate the relevance and viability of the hybrid mode labelled it or referred to it in different ways, e.g. as “Digitally remastered consecutive” or “Technology assisted consecutive” (Ferrari, 2002), “DRAC – Digital recorder assisted consecutive” (Lombardi, 2003), “Digital voice recorder assisted CI” (Camayd-Freixas, 2005), or “SimConsec” (Hamidi and Pöchhacker, 2007). I opted for the term Consec-simul with notes (shortened to Consec-simul) to underline the fact that the interpreter still works with a pen and paper, and therefore that notes are still possible. This label also reflects the combination of both modes, consecutive and simultaneous, and the way the interpretation unfolds. The steps involved in consecutive interpreting are listening, understanding, memorising and note-taking; the steps involved in simultaneous interpreting are listening, understanding and simultaneously expressing the content in the target language.

Using Gile’s (1995) now familiar Effort Models, by which Gile conceptualises the interpreting act as a series of efforts to be coordinated and managed to perform well, the operating processes undertaken in the Consec-simul with notes mode could be mapped as follows:

Phase 1:     

Listening 1 and analysis 1

Short-term memory operations


Phase 2:          

Listening 2 and analysis 2

Short-term memory operations

Long-term memory operations (reconstructing the speech)

Note-reading/Retrieving information/Anticipation/

Operating the pen


During phase 1, the effort components are identical to those for a traditional consecutive performance except that the interpreter knows that he/she will hear the speech a second time and interpret it simultaneously, and that he/she will have the possibility of slowing down or speeding up the audio playback with the digital pen. The interpreter is therefore likely to take notes in a different way and perhaps focus more on the structure of the speech, or write prompts about the pen features to use at a certain time during the interpretation. This ‘anticipatory’ knowledge is likely to lead to more economical note-taking, with a focus on the macro-linguistic and structural elements of the speech.

During phase 2, the effort components that are usually required and coordinated in simultaneous interpretation are facilitated by the fact that the interpreter hears the content of the speech for the second time. This ‘recently-acquired familiarity’ with the content, coupled with specific notes the interpreter may have taken, should make it easier to manage the extra load that the added operations may bring (e.g. anticipation, re-reading and matching notes from the first hearing, using other functions of the pen, such as slowing down or speeding up the playback).

The Study

The study aimed at comparing interpreting performances delivered in two different modes, namely the “traditional” consecutive mode and the new dual hybrid mode, Consec-simul with notes, whereby the interpreter can perform from their notes as well as from playing back the recorded source speech. It specifically focussed on comparing the interpreting performance of four professional interpreters (working in the English-French pair) on the basis of accuracy, source-target correspondence and fluency.

The study also aimed at measuring the level of communication or interaction interpreters have with their audience when interpreting in one mode or the other. Participants were informed of this aim and were asked to consider the two other people in the room and the camera as their ‘audience’ during their interpretations. This is an important point to underline, since we wanted to see if these interpreters would attempt to improve the ‘lack of eye contact’ aspect previously mentioned. If so, this might suggest that interpreters who are told about the issue, or even trained to address it, might be able to ‘control’ what appears to be a drawback in the use of such technology, and be more natural and communicative.

The viability of the hybrid mode using the digital pen in the profession was also tested. The focus was therefore put on the interpreters’ perspective about the use of the Consec-simul with notes mode with the Smartpen, in a real-life situation, to determine if they would consider using the tool in their future practice. The full methodology, procedure and results can be found in an article published in 2014 (Orlando 2014) but are summed up hereafter.

Four English-French interpreters (all recent graduates) agreed to take the test and to interpret two speeches consecutively, one as a traditional consecutive interpretation, the other in the hybrid mode. The equipment used for the Consec-simul with notes performance was the Livescribe Smartpen digital pen, model Pulse™, and an A5 Livescribe notebook of micro-chipped paper. The experiment was conducted in the English-French pair and the analysis was made on the interpretations of speeches delivered in English and interpreted into French. The texts used for the study comprised speeches that were similar in terms of topic (transatlantic relations), length and density of information. Both speeches had been previously video-recorded from a delivery by the same English native speaker. The objectives of the study were explained to the participants as follows: “Our aim is to test the validity of the use of digital pen technology in the ‘Consec-simul with notes’ mode compared to the conventional consecutive one. Previous comparative studies have shown better accuracy but a lack of eye contact in the hybrid mode of interpreting; therefore, the experiment will also aim at checking the accuracy of the interpretation in both modes and the eye contact instances with your audience”. Before starting the actual experiment, the interpreters were given half an hour to get used to the pen functions, and were also given the opportunity to practise interpreting in Consec-simul with another speech of similar topic and length. Each video-recorded source speech in English was played without pause to the interpreter, who then had to interpret it into French. Interpreters were allowed a 15-minute break between speeches. Interpretations were all video-recorded. After the experiment, participants were asked to stay in the room to fill in a questionnaire about their impressions and feelings.

After the experiment, the features of the interpreted performances (accuracy, eye contact instances, hesitation phenomena) were measured and analysed from the objective factors captured on the video. To measure the performance of the interpreters in terms of accuracy in each mode, each sentence of each speech was chunked into “units of meaning” (Seleskovitch and Lederer, 1989), representing facts and ideas, which were then aggregated. Each recorded interpretation was then transcribed orthographically (with the hesitations reported) and compared to the source speech, and the units of meaning were counted for each interpreter in each mode of interpreting. The measurement consisted in checking the number of units of meaning understood by the interpreters and rendered fully in their performance. The way the rendition was phrased and its effect on an audience were not measured.
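As a rough illustration of this measurement, the sketch below computes the share of units of meaning rendered fully in one interpretation. It assumes that each source unit has already been coded by hand as rendered or not; the function name `accuracy_score` and the sample coding are hypothetical and do not come from the study.

```python
def accuracy_score(rendered_flags):
    """Return the share of source units of meaning that were rendered fully.

    rendered_flags: one boolean per unit of meaning in the source speech,
    True if the unit was understood and rendered fully, False otherwise.
    """
    if not rendered_flags:
        raise ValueError("no units of meaning to score")
    return sum(rendered_flags) / len(rendered_flags)


# Invented coding for a four-unit passage (illustration only, not study data).
example_coding = [True, True, False, True]
print(accuracy_score(example_coding))
```

Computing one such score per interpreter and per mode would give the figures to compare; how partially rendered units are handled is a coding decision that this sketch leaves to the analyst.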

As an indication of how communicative each interpreter was in each mode, each eye contact instance with members of the audience was recorded and classified as short or long, i.e. less or more than 1.5 seconds. Research in oculesics (the elements of kinesics dedicated to eye-related nonverbal communication) has shown that eye contact instances in a public-speaking situation indicate more or less interest, attention and involvement with the audience (Beebe 1974). Studies on gaze (length of gaze, frequency of glances, patterns of fixation) have indicated that speakers usually give the audience more frequent and longer glances when they know their topic well, and that an increase in the amount of eye contact generated by a speaker significantly increases the speaker's credibility (Gu and Badler 2006, Beebe 1974). During the interpreting performance, shorter eye contact occurrences seem to indicate simply the acknowledgement of the communication situation and the interpreter’s awareness of an audience to connect with. Longer eye contact occurrences seem to indicate the interpreter is engaging more deeply with the recipients of the interpretation and is speaking directly to members of the audience.
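The short/long classification can be sketched as follows, assuming the duration of each eye contact instance has been timed from the video. The 1.5-second threshold comes from the text; the function name, the sample durations and the choice to count an instance of exactly 1.5 seconds as short are assumptions made for illustration.

```python
LONG_THRESHOLD_S = 1.5  # threshold stated in the text


def classify_eye_contact(durations_s):
    """Count eye contact instances as 'short' or 'long'.

    durations_s: durations in seconds, one per eye contact instance.
    Assumption: an instance of exactly 1.5 s is counted as short.
    """
    counts = {"short": 0, "long": 0}
    for duration in durations_s:
        label = "long" if duration > LONG_THRESHOLD_S else "short"
        counts[label] += 1
    return counts


# Invented timings (illustration only, not study data).
print(classify_eye_contact([0.5, 1.5, 2.0]))
```

Running the same tally for each interpreter in each mode gives the counts of short and long instances that the comparison relies on.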

The measurement of hesitation phenomena was done by counting the number of pauses, hesitations, false starts, etc., for each interpreter in each mode. Measuring these would show whether one mode produced more occurrences than the other, which would affect the fluency of the interpretation. “Disfluencies” as Garnham called them, such as “hesitations, pauses, ums and ahs, corrections, false starts, repetitions, interjections, stuttering and slips of the tongue” (Garnham, 1985, p.206), have an impact on the fluency of the interpretation as they indicate hesitations in understanding the content, in retrieving the meaning of words or symbols noted down, in finding the right syntactical construction in the target production, but also nervous tension on the part of the interpreter. Goffman considers these “linguistically detectable faults” or “influencies” (1981, p.172) as manifestations of the efforts of reasoning and formulation which accompany linguistic production. For him, the skill of professional speakers, such as the university lecturer or the radio announcer, is to control output in such a way as to hide these efforts and any hesitations they may entail. The speaker maintains control of any hesitations which could surface as “linguistically detectable faults”. As indicated by Mead (2000, p.91), “Goffman’s discussion provides an interesting theoretical basis for evaluation of fluency. Given that interpreters can to all intents and purposes be considered professional speakers, the definition of fluency by default (i.e. absence of influencies) can also prove relevant to evaluation of interpreting.”
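One way to tally hesitation phenomena from a transcript is sketched below. It assumes the transcriber marked each occurrence with a bracketed tag; the tag set used here (`[FP]` filled pause, `[UP]` unfilled pause, `[FS]` false start, `[REP]` repetition) is a hypothetical annotation scheme, not the one used in the study.

```python
import re
from collections import Counter

# Hypothetical tags: FP = filled pause, UP = unfilled pause,
# FS = false start, REP = repetition.
TAG_PATTERN = re.compile(r"\[(FP|UP|FS|REP)\]")


def count_disfluencies(transcript):
    """Return a Counter of disfluency tags found in an annotated transcript."""
    return Counter(TAG_PATTERN.findall(transcript))


# Invented transcript fragment (illustration only, not study data).
sample = "Les [FP] relations [FS] les relations [UP] transatlantiques [FP]"
counts = count_disfluencies(sample)
print(counts, sum(counts.values()))
```

Summing the counter gives the total number of disfluencies for a performance, which can then be compared across the two modes for each interpreter.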

Finally, to collect participants’ perspective on the mode and the potential use of the technology in professional practice, a questionnaire was distributed at the end of the experiment. It consisted of nine open-ended questions and was presented to participants after their performance in the Consec-simul mode.


The accuracy of interpretations was calculated based on units of meaning being conveyed in the interpretation. The collected data show that when interpreting in the Consec-simul hybrid mode, the interpreters were more accurate and rendered more source information than in the conventional consecutive mode. This matched and confirmed what previous studies on technology-assisted consecutive interpreting had shown (Lombardi 2003, Vivas 2003, Camayd-Freixas 2005, Hamidi and Pöchhacker 2007, Hiebl 2011).

The interpreters were told that in previous comparative studies, results had shown a lack of contact with the listeners in the technology assisted mode and that the study of their own interaction with the ‘audience’ would be one of the objectives of that test.

Our data showed that interpreters acknowledged the presence of their audience and interacted with their listeners in both modes. As expected given the specificities of traditional consecutive, three interpreters out of four had more eye contact with the audience in that mode, but one of them had actually more eye contact overall in the hybrid mode.

What was also interesting to note was that the differential ratio ‘long consistent eye contact/short eye contact instances’ was lower in the Consec-simul mode than in traditional consecutive. In contrast to what some of the earlier comparative studies revealed, there is little evidence here of a uniformly lower interaction in the hybrid mode. In fact, all interpreters maintained eye contact with the audience, with a steady number of long instances in the second speech (with one interpreter having twice as many instances of long eye contact with the audience in the second speech).

Considering the above-mentioned research in oculesics (Gu and Badler 2006, Beebe 1974), we can assume that the longer the eye contact, the more engaged interpreters are with the audience, the greater their assuredness in delivery and the deeper their command of the speech must be. If this communicative behaviour in the simultaneous part of the task is linked with the fact that they were alerted to the issue beforehand, this may indicate that with a certain degree of awareness, and even more importantly, with training, interpreters may be perfectly able to stay well connected with their audience and appear natural, even when providing the simultaneous interpretation. The fact that interpreters hear the speech for the second time when interpreting in this mode must also facilitate this capacity to communicate naturally.

In the study, the number of occurrences of hesitations (false starts, unfilled pauses, filled pauses with instances of “ers, ums, ahs”, repetitions, redirections) was noted down and reported in the transcription of each individual performance in each mode. The data collected showed that ‘disfluencies’ are more frequent in the traditional consecutive than in the Consec-simul mode, and for all interpreters. This is not surprising as the effort required in consecutive interpreting to read notes, to retrieve meaning and logical structure of the ST, and to make a decision on the best reformulation, may often lead to more hesitations in the production phase than in the simultaneous mode where the interpreter follows the flow and pace of the speaker. Gile (1995) puts forward the argument that simultaneity [of listening and speaking] can sometimes make semantic and syntactic choices easier for the interpreter.

Based on the observations during the experiment and during the analysis of the video recorded data, as well as on Mead’s aforementioned comments regarding evaluation of interpreting performances (see above), fewer disfluencies are unsurprisingly indicative of better fluency in the delivery/production. This is an important point in the comparison of the two modes because, as Mead (2000, p.90) also points out, “surveys among interpreters and conference participants confirm the importance of fluency as a determinant of quality in interpreting”. And quoting Altman (1994) he also indicates that “fluency […] is the one single aspect of an interpretation which most palpably distinguishes a professional performance from that of a trainee”.

When linked with the data concerning accuracy and the different ratio of instances of long/short eye contact in the hybrid mode, the above-mentioned ideas seem to suggest that during an interpretation in the Consec-simul with notes mode a higher level of accuracy (comprehension and rendition of the source text) may co-occur with greater fluency (fewer disfluencies) of the delivery and natural communication with the public (more consistent eye contact instances). This may in turn allow a professional performance and service of better quality. Should this be backed up by further studies on a larger scale, the impact of the use of digital pen technology on interpreting pedagogy and training could be of wide-ranging importance.


The interpreters in this sample all declared that they felt more confident in the Consec-simul with notes mode, that they sensed they provided a better performance, and that they preferred interpreting in this mode. All also added they would use it in future professional settings, provided they engage in or invest in more (self-) directed training with the digital pen and its features. This hybrid mode of interpreting seems to offer various intriguing possibilities for the profession. As discussed, the advantage is that the interpreter already knows the content of the speech when (s)he starts interpreting, can use the notes (s)he has taken in anticipation or backup, and can also slow down the playback of the recording if necessary. However, even if the simultaneous interpretation phase is facilitated, the difficulty of working in this mode lies in the various tasks to be completed simultaneously: starting the playback, listening and understanding, speaking, reading the notes, and operating the pen if necessary. That is why directed and specific training would be required to perform adequately in this unorthodox mode (Orlando 2015, Setton and Dawrant 2016).

It is important to note that the number of participants tested in this study was small, and the sample size precludes any claim that their experiences and attitudes are representative of most interpreters. However, the results of this study were promising insofar as the use of digital pen technology in the hybrid mode of interpreting Consec-simul with notes seemed to indicate better quality of performances and greater comfort in performing. Further research should be encouraged to gather more evidence of this and to motivate training programmes to introduce the technology in their curricula, with the aim of both facilitating the work of interpreters and improving the service to end-users who expect high quality in the performances of professionals in any context where long consecutive interpretation is required.


Beebe, S.A. (1974). Eye contact: A nonverbal determinant of speaker credibility, The Speech Teacher, Vol 23, 1, 21-25.

Boyle, J. (2012). Note-taking and secondary students with learning disabilities: challenges and solutions, Learning Disabilities Research & Practice, vol. 27, no. 2, 90–107.

Camayd-Freixas, E. (2005). A Revolution in Consecutive Interpretation: Digital Voice-Recorder-Assisted CI, The ATA Chronicle 34, 40-46.

Dawson, L & Plummer, V. (2010). Building a system of managing clinical pathways using digital pens, in D Hansen, L Schaper & D Rowlands (eds.), HIC 2010, vol. 18, 32-36.

Ferrari, M. (2002). Traditional vs. ‘simultaneous consecutive’, SCIC News 29, p 6-7. Retrieved from http://scic.cec.eu.int/scicnews/2002/020130/default_29.htm

Garnham A. (1985). Psycholinguistics: Central Topics, London-New York, Routledge.

Gile, D. (1995). Basic Concepts and Models for Interpreter and Translator Training. Amsterdam/Philadelphia: John Benjamins.

Goffman E. (1981). Forms of Talk, Philadelphia, University of Pennsylvania Press.

Grieve, C R & McGee-Lennon, M R. (2010). Digitally augmented reminders at home, Poster presenting results of summer work, viewed on 14 November 2014, http://www.multimemohome.org/files/papers/PenPosterPortrait_final.pdf

Gu, E. and Badler, N.I. (2006). Visual attention and eye gaze during multiparty conversations with distractions, Intelligent Virtual Agents, Lecture Notes in Computer Science, Vol 4133, p.193-204.

Hamidi, M., and Pöchhacker, F. (2007). Simultaneous consecutive interpreting: a new technique put to the test, Meta: Translator’s Journal, 52, 2, 276-289.

Hiebl, B. (2011). Simultanes Konsekutivdolmetschen mit dem Livescribe™ Echo™ Smartpen. Masterarbeit, Universität Wien. Retrieved from http://othes.univie.ac.at/1460...

Lombardi, J. (2003). DRAC Interpreting: Coming Soon To A Courthouse Near You? http://www.najit.org/proteus/PDFVersions/Proteus_Spr03%20web.pdf

Mead, P. (2000). Control of pauses by trainee interpreters in their A and B languages, EUT - Edizioni Università di Trieste. Retrieved from http://hdl.handle.net/10077/2451

Orlando, M. (2014). A study on the amenability of digital pen technology in a hybrid mode of interpreting: Consec-simul with notes, The International Journal of Translation and Interpreting Research, 6, 2, 39-54.

Orlando, M. (2015). Digital pen technology and interpreter training, practice and research: Status and trends in S Ehrlich & J Napier (eds.), Interpreter Education in the Digital Age, Gallaudet University Press, Washington DC, 125-152.

Pöchhacker, F. (2012). “Consecutive 2.0”, 53rd annual conference of the ATA, San Diego, California, 2012.

Seleskovitch, D. and Lederer, M. (1989). Pédagogie raisonnée de l’interprétation, Brussels-Luxembourg, Opoce, Didier.

Setton, R. and Dawrant, A. (2016). Conference interpreting: A complete course, Amsterdam/Philadelphia: John Benjamins.

Vivas, J. (2003). Simultaneous Consecutive: Report on the comparison session of June 11, 2003. SCIC B4/JV D2003, Brussels, European Commission, Joint Interpreting and Service.

Recommended citation format:
Marc ORLANDO. "Testing digital pen technology in a hybrid mode of interpreting". aiic.net. May 2, 2017. Accessed May 27, 2017. <http://aiic.net/p/7944>.