Status: 12 November 2023
This collection is close to complete. Additional presentations will be added in the coming days.
If you do not find your presentation or require changes, please let us know at <firstname.lastname@example.org>.
We Believe in Humans
Isabelle will dive into how people interact with machines today and how this will evolve. She believes humans always have been and will continue to be key in this relationship. Looking ahead to a future where we meet the singularity, she sees our real challenges not in AI itself but in the opportunities AI creates. With the localization industry at the heart of these changes, Isabelle will inspire us to embrace the new possibilities ahead.
Translated co-founder | Pi School CEO
Isabelle Andrieu is an entrepreneur, a serial startup investor, and a mother of three. In 1999, together with Marco Trombetti, she co-founded Translated, a translation service provider that pioneered the use of artificial intelligence to help professional translators.
Today Translated has over 270,000 customers, including companies and institutions such as Airbnb, Google, IBM, and the European Commission. It has created language technologies utilized by millions of users every month, such as MyMemory and MateCat. At Translated, Isabelle now serves as chairwoman and is involved in localization operations and training. She loves speaking about people empowerment and leadership, especially for women.
Isabelle is also co-founder and first citizen of Pi Campus, a venture capital firm investing in early-stage technology startups, mostly in the artificial intelligence field, with more than 50 investments across Europe and the US in its portfolio. Some of the companies funded by Pi Campus are based in a startup district in Rome, where luxury villas have been converted into offices to provide the best work environment for talent. Isabelle oversees improvements to the workplace and is the point of reference for startups that need to hire talent.
Since October 2019, she has served as CEO of Pi School, the educational branch of Pi Campus. Pi School is an innovative school building a top-tier class of engineers by offering hands-on training, delivered by highly experienced professionals, to a small number of attendees selected from all over the world on the basis of their talent.
Ethical challenges in Responsible AI: Responsible Machine Translation and the impact of Large Language Models
Responsible AI, Green AI, AI for Social Good, Fair AI, and Ethically Aligned Design are all terms encompassed under a generic umbrella usually referred to as ethics and AI.
The topic has been highly disruptive, yet it remains fairly broad and in need of substantial concrete action, since political power and industry will be heavily affected, both now and in the near future, by the legislation and market rules surrounding Responsible (in the sense of accountable) AI.
The impact of Responsible AI is not confined to technical, research, and industry applications; it also extends to the everyday actions of every citizen and affects all areas of society.
This talk will introduce the core concepts and pillars of Responsible AI, highlight the main initiatives created, and discuss, based on concrete examples, the ethical concerns around several AI applications. We will zoom in for a closer look at Machine Translation (MT), to tune our contextualised view of Responsible AI and apply it to the scope of multimodal, multilingual and multicultural MT. The talk will also cover examples of research projects in Responsible AI and initiatives to create worldwide Centres for Responsible AI, aligned with the United Nations Sustainable Development Goals, and their core challenges in embracing industry differentiation with Responsible AI and human benefit.
President, EAMT | Vice President, IAMT | Assistant Professor, University of Lisbon
Helena Moniz is the President of the European Association for Machine Translation and Vice President of the International Association for Machine Translation. She is also the Vice-Coordinator of the Human Language Technologies Lab at INESC-ID, Lisbon. Helena is an Assistant Professor at the School of Arts and Humanities at the University of Lisbon, where she teaches Computational Linguistics, Computer Assisted Translation, and Machine Translation Systems and Post-editing. She graduated in Modern Languages and Literature at the School of Arts and Humanities, University of Lisbon (FLUL), in 1998. She received a PhD in Linguistics at FLUL in cooperation with the Technical University of Lisbon (IST), in 2013. She has been working at INESC-ID/CLUL since 2000, in several national and international projects involving multidisciplinary teams of linguists and speech processing engineers. Within these fruitful collaborations, she participated in 16 national and international projects. Since 2015, she has also been the PI of a bilateral project with INESC-ID/Unbabel, a translation company combining AI and post-editing, working on scalable Linguistic Quality Assurance processes for crowdsourcing. She was responsible for the creation of the Linguistic Quality Assurance processes developed at Unbabel for Linguistic Annotation and Editors’ Evaluation. She is now working mostly on research projects involving Linguistics, Translation, and Responsible AI. In a sentence, she is passionate about Language Technologies!
Advanced pre-processing for the translation of multilingual documents in XML at the European Parliament
Multilingualism is one of the European Parliament’s core values and missions. The Directorate-General for Translation (DG TRAD) plays a significant role in the eLegislate project, which aims to improve the production of documents, from authoring to publication, by reducing manual operations through automation. It is based on XML4EP, a flavour of AKN4UN. Developing and adapting the IT tools and workflows to XML4EP has been a challenging transition for translation support units. DG TRAD has opted for a mixed approach: buy, build and customise. It uses Trados Studio, with bespoke and/or customised plugins, and a series of custom-built services and applications, including a computer-assisted translation tool, to ensure the translation of documents included in the eLegislate chain. Multilingual source texts pose a challenge for language professionals and IT teams alike. This presentation details the challenges arising from dealing with multilingual texts in XML and offers an overview of the IT landscape developed to pre-process them. Based on the XML mark-up and the metadata accompanying translation requests, it provides the intercultural language professionals in DG TRAD with the tools to translate them in an efficient and user-friendly way.
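As a purely illustrative sketch of the kind of pre-processing described above (the element and attribute names below are hypothetical and do not reflect the actual XML4EP schema), a multilingual XML document can be split by declared language so that each fragment is routed to the appropriate translation workflow:

```python
# Illustrative sketch only: the real XML4EP schema differs; the <doc>/<p>
# structure here is a hypothetical stand-in.
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace of the standard xml:lang attribute.
XML_NS = "http://www.w3.org/XML/1998/namespace"

sample = """\
<doc>
  <p xml:lang="en">Draft report</p>
  <p xml:lang="fr">Projet de rapport</p>
  <p xml:lang="en">Explanatory statement</p>
</doc>"""

def group_by_language(xml_text):
    """Group translatable fragments by their declared source language."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for elem in root.iter("p"):
        # ElementTree exposes xml:lang as a namespace-qualified key.
        lang = elem.get(f"{{{XML_NS}}}lang", "und")  # 'und' = undetermined
        groups[lang].append(elem.text)
    return dict(groups)

print(group_by_language(sample))
```

In a real pipeline, the grouped fragments would then be packaged per language pair, together with the request metadata, before being handed to the CAT environment.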
Paula Vlaic has a translator’s degree from the Applied Modern Languages Section of Babeș-Bolyai University, in Romania, where she studied English and French and followed the courses for translators specialised in business and international relations. She also holds a Master’s degree in European Legal Studies from the Université de Lorraine and the European Institute of Public Administration. She started working on the translation of the acquis in view of Romania’s accession to the EU in 2002. She joined the European Parliament in 2005, with the pre-accession team, and was a member of the Romanian Translation Unit for 11 years, first as a translator and then as a quality coordinator. In 2017, she became the head of the Euramis-Pre-Translation Unit. In 2019, she moved to the Applications and IT Systems Development Unit, leading the team that ensures the development, maintenance and customisation of the IT tools that support the workflows and translation processes in DG Translation at the European Parliament.
[Bio text to follow]
Automation, digitalisation and the role of the translator at the European Parliament: prospects, challenges and job perceptions
Vilelmini Sosoni, PhD, is Associate Professor at the Department of Foreign Languages, Translation and Interpreting at the Ionian University in Greece. She has taught Specialised Translation in the United Kingdom at the University of Surrey, the University of Westminster and Roehampton University, and in Greece at the National and Kapodistrian University of Athens, Metropolitan College and the Institut Français d’Athènes. She also has extensive professional experience having worked as a professional translator, editor and subtitler. She studied English Language and Literature at the National and Kapodistrian University of Athens and holds an MA in Translation and a PhD in Translation and Text Linguistics from the University of Surrey. Her research interests lie in the areas of the Translation of Institutional and Political Texts, Corpus Linguistics, Audiovisual Translation and Accessibility, as well as Machine Translation and Cognitive Science. She is, among others, a founding member of the Laboratory “Language and Politics” of the Ionian University and the Greek Chapter of Women in Localization and a member of the “Research Centre for Translation and Intercultural Studies” of the University of Roehampton. She is also a member of the Advisory Board and the Management Board of the European Master’s in Technology for Translation and Interpreting (EM TTI) funded by Erasmus+. She has participated in several EU-funded projects, notably Resonant, Trumpet, TraMOOC, Eurolect Observatory and Training Action for Legal Practitioners: Linguistic Skills and Translation in EU Competition Law, while she has edited several volumes and books on translation and published numerous articles in international journals and collective volumes.
Chrysanthemi Michali holds a BA and an MA in Translation from the Department of Foreign Languages, Translation and Interpreting of the Ionian University in Greece. She works professionally as a technical translator and subtitler, as well as Lead Project Manager at MTLT (MedicoTechnical Localization and Translation). She is a native speaker of Greek and her working languages are English, French and German. She is a member of the Panhellenic Association of Professional Translators Graduates of the Ionian University (PEEMPIP) and is currently serving as the Social Media Manager for the Women in Localization – Greece chapter. Furthermore, she is a trained audio describer and subtitler for the D/deaf and the hard of hearing.
Michail Panagopoulos, PhD, is Associate Professor at the Department of Audio and Visual Arts of the Ionian University in Greece. He received a BEng in Electrical and Computing Engineering from the National Technical University of Athens (NTUA). He holds a PhD in pattern recognition and image processing in archaeology and the arts. His research is broadly concerned with artificial intelligence, machine learning, pattern recognition and image processing, with applications in the arts, cultural heritage, and audiovisual technology. He has taught undergraduate and graduate courses including Artificial Intelligence (AI), Mathematics and Art, 3D Graphics, Digital Synthesis of Virtual Environments, Audiovisual Systems for Alternative Reality, and Mathematics for Audiovisual Technology. He has published extensively in the areas of AI and audiovisual technology, and he has participated in several national and EU-funded research projects, both as coordinator and as researcher.
Beyond Compliance: Making Translation Software Accessible
There are widely accepted standards, such as the Web Content Accessibility Guidelines (WCAG), that can be implemented to help the visually impaired, and screen readers are commonly used assistive technologies. Dictation and narration tools allow users to dictate text and to have text read aloud. But these resolve only a portion of the issues faced, as “compliant” is not the same as “usable”.
We are lucky to work with a blind Ph.D. candidate with the aim of making a standard translation environment fully usable for blind and visually impaired users. The challenges she faces go beyond what we anticipated, and this was an enlightening exercise for our product teams.
This presentation shows the power of academia working with a software vendor to make working life feasible for blind translators.
Daniel Brockmann is Principal Product Manager at Team Trados. Daniel product-manages a wide range of applications in the Trados portfolio designed for translators, project managers and terminologists – from the translation productivity environment Trados Studio via the terminology management suite MultiTerm to the software localization solution Passolo. He also helps the team evolve the translation management solution Trados Enterprise as well as the collaboration solutions Trados GroupShare (on-premise) and Trados Team (cloud-based).
Bridging the gap: Exploring the cognitive impact of Interpretbank on Chinese interpreting trainees
I am a Ph.D. student at the University of Bologna. Under the supervision of Professor Ricardo Munoz Martin and Victoria LEI Lai Cheng, my research focuses on the area of terminology management from a cognitive perspective. More specifically, I am engaged in an in-depth investigation of Chinese interpreter trainees’ information-seeking behavior during the preparation and delivery of simultaneous interpretation (SI). The primary objective of my research project is to explore the impact of advanced terminology tools versus traditional methods on the quality of simultaneous interpretation. To achieve this, I am using a multimodal synthesis of Python-based keystroke event logging, screen recording, and output recording techniques. These methods allow for a comprehensive examination and evaluation of the aforementioned tools and their influence on the overall efficiency of the interpreting process.
Building High Capacity Teacher Models on HPC Infrastructures for the eTranslation Service
eTranslation, the European Commission’s machine translation (MT) service, provides neural machine translation between all 26 official languages of the EU and the EEA. It leverages the European Institutions’ high-quality internal translation database, as well as additional parallel data from external sources. Even at this medium scale, MT services require substantial computational power and a continuous search for the right balance between the use of available resources and the best possible performance of the models. We aim to answer this resource–performance dilemma with knowledge distillation: we build high-capacity, complex teacher MT models using supercomputer infrastructure, and use their output to produce cost-effective, fast, production-ready bilingual and multilingual student models. Evaluation results for the teacher models show a significant 10% improvement over baseline models, and the student models also demonstrate a slight improvement over the current eTranslation models, while being smaller, significantly faster and more efficient to deploy. We open-source the models to support the MT community with high-quality MT services in the EU formal language domain.
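The core idea of sequence-level knowledge distillation can be sketched in a few lines: the student is trained not on the original reference translations but on the teacher's own output for the source corpus. The models below are toy stand-ins (a dictionary lookup instead of a real NMT system), shown only to make the data flow concrete; the actual eTranslation setup uses large neural teachers on HPC infrastructure.

```python
# Toy sketch of sequence-level knowledge distillation: a (stand-in) teacher
# translates the source corpus, and its output becomes the synthetic
# parallel data on which the smaller student model would be trained.
def teacher_translate(src):
    """Stand-in for a large teacher model's beam-search output."""
    lexicon = {"hello": "bonjour", "world": "monde"}
    return " ".join(lexicon.get(tok, tok) for tok in src.split())

def build_distillation_corpus(source_corpus):
    """Pair each source sentence with the teacher's translation of it."""
    return [(src, teacher_translate(src)) for src in source_corpus]

corpus = ["hello world", "hello"]
training_pairs = build_distillation_corpus(corpus)
# The student is then trained on `training_pairs` instead of the
# original human references, which makes its job easier to learn.
print(training_pairs)
```

Because the teacher's output is more regular than human references, the student can match much of the teacher's quality with a fraction of the parameters, which is what makes the distilled models cheaper to deploy.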
Mr. Oravecz was a founding member of the Department of Language Technology at the Research Institute for Linguistics, Hungarian Academy of Sciences, and has been active in the field of computational linguistics and natural language processing ever since the start of his professional career. He graduated as an electrical engineer at the Technical University of Budapest, and also holds a degree in theoretical linguistics and English.
He was one of the chief developers of the Hungarian National Corpus and worked as principal researcher and project manager in several projects focusing on the development of language resources and natural language processing applications, on the development and deployment of complex processing environments for textual data and the automatic linguistic analysis of natural language text using machine learning methods.
He was the editor and (co-)author of project deliverables and research papers, teacher in several university courses on natural language processing (NLP), machine learning, statistics and programming languages until his leave in 2015, when he joined the eTranslation project at the Directorate-General for Translation, EC.
His main focus in the eTranslation project has been the application of deep learning algorithms and neural networks, and the development of high-quality machine translation systems with special emphasis on challenging languages.
ChatGPT Translator Plus
I have spent most of my working life – the last 55 years – in the “language business”. After working as a translator and lexicographer, I ran a translation business for some 12 years, during the course of which my interest in machine translation arose. I taught myself Java at the age of 50 and then went on to write a Dutch–English rule-based machine translation program used to translate technical documentation for Siemens Nederland on its major infrastructure projects in the Netherlands.
For the past six years I have been working in the field of Neural Machine Translation, developing custom NMT models for industrial and business clients. I have now adopted Python as my main programming language. In view of the proliferation of cloud-based vendors in the major language pairs, I have decided to focus on low-resource languages.
Since the advent of Large Language Models, I have switched to making small apps, such as a completely desktop-based version of Facebook Research’s NLLB framework and ChatGPT Translator Plus, which provides an easy-to-use gateway to OpenAI’s powerful models.
ChatGPT vs. DeepL: Artificial Intelligence compared to the self-proclaimed world’s most accurate translator via human evaluation of coreference translations from English into French
The resolution of coreference links is a Natural Language Processing (NLP) task consisting in detecting, in a text, linguistic expressions (mentions) that refer to the same entity in the real or imaginary world. A coreference is a link between two or more mentions referring to the same entity. Anaphora is the relationship between two mentions, one of which refers to the other. Coreference or anaphora resolution impacts how those linguistic phenomena are translated, above all when the target language, French, is more gendered than the source language, English. We present a quality evaluation of coreference translations made by ChatGPT and DeepL. We selected 156 mentions from ParCorFull 2.0, a corpus of aligned English and French TED talks annotated for coreference. Each segment is made up of the sentence where the entity occurs and the sentences in which the mentions occur. We then translated the segments using ChatGPT and DeepL, obtaining 156 translated segments for each system. We used the ACCOLÉ translation-error annotation platform to annotate the quality of the coreference translations, focusing on pronominal anaphora in order to study gender bias. As the study is still ongoing, we cannot provide results yet, although we can already observe that ChatGPT translates anaphora better than DeepL.
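Once the annotations are collected, the comparison between systems reduces to a per-system accuracy over the annotated pronoun pairs. The sketch below is not the authors' actual ACCOLÉ workflow; it is a minimal, hypothetical illustration of how such a score could be computed from (expected pronoun, system pronoun) pairs.

```python
# Hypothetical illustration (not the authors' ACCOLÉ pipeline): score how
# often a system renders an English pronominal anaphor with the expected
# French pronoun, e.g. "she" -> "elle" when the antecedent is feminine.
def gender_accuracy(annotations):
    """annotations: list of (expected_pronoun, system_pronoun) pairs."""
    correct = sum(1 for expected, got in annotations if expected == got)
    return correct / len(annotations)

# Invented example annotations for a single system.
system_a = [("elle", "elle"), ("il", "il"), ("elle", "il"), ("ils", "ils")]
print(gender_accuracy(system_a))  # 3 of 4 pronouns correct -> 0.75
```

Comparing this score across systems, and inspecting the cases where a feminine antecedent is rendered with a masculine pronoun, is one simple way to quantify the gender bias the study investigates.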
After studying technical writing and translation, Emmanuelle Esperança-Rodier graduated with a PhD in computational linguistics on the quality of writing in Simplified English (AECMA). Having worked as a project manager and translator in a translation agency for several years, she joined the Grenoble Alps University, Grenoble Computer Science Laboratory (LIG) where she is studying Skill-Based and Task-Focused Machine Translation Evaluation.
In her spare time, if any, she enjoys reading, yoga, crafts, music and long walks with her dog and her family.
After a Bachelor’s degree in English, where she specialised in translation, Sophie is about to graduate with a Master’s degree in Natural Language Processing. Currently working on the evaluation of the quality of machine translation for her internship with the Grenoble Computer Science Laboratory, she plans on finding a way to incorporate her passion in her job by working on the localization of video games.
Correcting biased translations with the Fairslator API
This short talk will introduce the Fairslator API, a software solution for gender rewriting and form-of-address rewriting of translations. The talk will start with a review of bias (including but not limited to gender-bias) in machine translation and will introduce the concept of rewriting as a method for solving the problem. We will demonstrate how the Fairslator API can be used to rewrite biased translations into alternative genders or forms of address, and we will survey the ways in which rewriting can be integrated into the translation workflow, for example as a step in machine translation post-editing. The talk will conclude with an evaluation of the API’s performance measured against two open-source benchmarks.
Michal is the person behind Fairslator, a tool for removing bias from machine translation. He works as a freelance language technology consultant and, besides Fairslator, is also the author of the open-source dictionary writing system Lexonomy and the open-source terminology management platform Terminologue. Michal has previously worked for Dublin City University, for Foras na Gaeilge in Dublin, and for Microsoft Ireland. He is currently based in Brno, Czech Republic.
Educating the next generation of translators in the age of AI
Claims of technology replacing translators have most recently been brought up with the emergence of tools powered by Large Language Models such as OpenAI’s ChatGPT. However, such claims are not new: parallels can be seen with earlier discussions related to (neural) machine translation, for example. Even if machines are not replacing translators, it is undeniable that technology is changing the landscape of the language industry and the role of translation professionals. The future of translators and the competences needed to work with such emerging technologies is therefore of significant interest to academics, practitioners and professional associations. In this presentation, we discuss the work that is ongoing in the International Federation of Translators (FIT) Standing Committee on Technology to address the impact of technology on the role and education of the next generation of translators. We reflect on what skills and competences translators will need to provide clients with solutions that meet the specifications and how translator education can provide them with the competences needed in the changing technological landscape.
School of Humanities/Foreign Languages and Translation Studies, University of Eastern Finland, Finland
Maarit Koponen currently works as Professor of Translation Studies at the University of Eastern Finland. Her teaching and research activities focus on the use of machine translation and other translation technologies, machine translation post-editing and quality evaluation. She is a member of the International Federation of Translators (FIT) Standing Committee on Technology as a representative of the Finnish Association of Translators and Interpreters (SKTL). As part of the EU-funded COST Action “Language in the Human–Machine Era” (LITHME), she chairs the working group on “Language work, language professionals”. She has also worked as a professional translator for several years.
Alan K. Melby
Raised in Indiana. Identifies as a Hoosier. Became fascinated with translation in the mid-1960s, while on study abroad in St Brieuc, France. In 1978, after obtaining a PhD in computational linguistics, experienced an intellectual crisis regarding the nature of language, concluding that unambiguous general language would be the ultimate prison, but domain-specific language can and should be unambiguous. In 1979, shifted focus toward tools for human translators. In the 1980s, became an ATA-certified French-to-English translator. In the 1990s, got into philosophy of language and wrote a book about human and machine translation (The Possibility of Language) with a philosopher, Terry Warner. In the 21st century, has focused on service to the translation profession, previously serving on the governing boards of ATA (www.atanet.org), then FIT (www.fit-ift.org), and currently (2023) serving as president of LTAC Global, a small non-profit, chair of the FIT North America regional center, and collaborator on the development of translation-related standards. In 2014, retired from full-time teaching and became an emeritus full professor. Since 2015, standards work has expanded to translation quality evaluation within the MQM framework (Multidimensional Quality Metrics for Translation Quality Evaluation) under the umbrella of ASTM International (www.astm.org), a standards body.
Translation Department, College of Languages, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
Amina Tahraoui is an Assistant Professor at Princess Nourah University. She teaches Translation Technology, Audiovisual Translation, and other specialized translation courses. Over the years, she has acquired experience in academic quality assurance and is currently leading a research project on using electronic rubrics for training translators. She is a member of the New England Translators Association (NETA) and serves as a member of the Technology Committee of the International Federation of Translators. She has published several papers in international journals. Her research interests include translation and interpretation training, cognitive linguistics, and neurolinguistics. Currently, her main focus is on translation and artificial intelligence, subtitling quality assessment and accessibility, and social justice and inclusion.
Envisioning the Post-Editor’s Workstation: A Backward Glance and a Glimpse Into the Future
Marie Escribe is a PhD student in Translation Technology at the Universitat Politècnica de València and a Linguistic Engineer at LanguageWire. She has over five years of experience as a translator, and holds an MA in Translation from London Metropolitan University and an MA in Computational Linguistics from the University of Wolverhampton. Her research interests revolve around translation technologies and include in particular post-editing, computer-assisted translation tools, translation memory systems and translation quality evaluation.
Miguel Ángel Candel-Mora
Miguel Ángel Candel-Mora is a professor in the Department of Applied Linguistics of the Universitat Politècnica de València, Spain, where he teaches Translation-oriented Terminology Management and Language Technologies for Translation in the Master’s Degree in Languages and Technology. He has been actively involved in the translation and localization industry for more than 25 years. His academic interests focus on intercultural communication, specialized languages, language technologies for translation, and translation-oriented terminology management. Currently, he is the director of the Master’s Degree in Languages and Technology at the UPV.
EU interinstitutional data principles and IATE data guidelines
In 2022, interinstitutional data principles were adopted as a step towards an active and common data management policy for EU linguistic tools, among them IATE. The established principles are the following: quality, transparency and proper annotation, completeness, data sharing and data reuse, and lifecycle management. We will present the main mechanisms available in IATE to achieve these data principles, as compiled in the recently drafted IATE data guidelines.
In the private sector, John Kirby worked for several years as a translator, reviser, interpreter and technical writer. He also worked on machine translation systems at both Siemens and Langenscheidt. After moving to the public sector, he spent 15 years as a translator and reviser at the European Commission before moving in 2015 to the Commission Terminology Coordination unit, where his areas of responsibility include terminology project coordination, training and communication.
Forecasting translation needs: Practices at the European Parliament’s DG for Translation
The European Parliament’s Directorate-General for Translation (DG TRAD) is a global leader in the field of translation. In 2022 alone, DG TRAD translated 2.8 million pages into all the official EU languages. To meet such vast translation needs, efficient translation management and reliable workload forecasting are of paramount importance.
DG TRAD’s Planning Unit is the entry point for all translation requests within the European Parliament. In this capacity, the unit works closely with all the services that require translation, particularly those involved in the preparation of future EU laws. This close cooperation makes it possible for the unit to assess current and future translation needs arising from active files, and in this way draw up a forecast of the expected workload.
However, the transformation of our theoretical knowledge of expected translation needs into a quantifiable number of pages requires a specific tool and a specific method for ‘translating’ data into charts.
The presentation will describe how forecasts are prepared, how data is collected and interpreted, and what tools and calculation methods are used. It will touch upon the limitations of the forecasts prepared, their reliability, and possible future developments in the field.
Simona Križaj Pochat
With 20 years of experience working for the European Parliament’s Directorate-General for Translation (DG TRAD), Simona Križaj Pochat has evolved from the role of translator and quality coordinator to that of Head of DG TRAD’s pivotal Planning Unit, whose task is to coordinate the translation of all legislative and administrative documents at the European Parliament.
Planning the (frequently) unplannable is always a challenge, but all the more so when working in a very large translation service with constantly growing translation demand. In this edition of the Translating and the Computer conference, Simona will be co-presenting a very important element in the European Parliament’s translation management strategy: the forecasting of upcoming translation needs and workload.
Hristina Stoimenova-Nenkova began her career at the European Commission’s Delegation to Bulgaria in 2004. Three years later, driven by her deep-rooted passion for languages and intercultural communication, she joined the European Parliament’s Directorate-General for Translation (DG TRAD). Throughout her professional journey, Hristina has gained invaluable experience by taking on a range of different roles in the translation services, including those of translator, work allocator, and client liaison officer.
Since 2017, Hristina has been coordinating the preparation of the regular translation workload forecast issued by DG TRAD’s Planning Unit, and is responsible for overseeing the development of the workflow tool that is used as a basis for the forecast.
In this edition of the Translating and the Computer conference, Hristina will be co-presenting the topic of DG TRAD’s efforts to forecast upcoming translation needs and workload – a very important element in the European Parliament’s translation management strategy.
From shifting thoughts to unlocking knowledge: The power of terminology in the digital era
Lucy Walhain holds a Master’s Degree in Translation from the Université catholique de Louvain-la-Neuve (UCLouvain). During her Master’s, she specialised in terminology management and localisation.
Since joining the Publications Office of the European Union, she has contributed to the definition and implementation of standards and interoperability solutions in the metadata domain, both within the Publications Office and at the interinstitutional level, and to the maintenance of controlled vocabularies such as EuroVoc.
Mihai Paunescu is a semantic data consultant working with the Publications Office of the European Union. Active in the area of reference data management, structures and linked data, he provides technical support for the ingestion, maintenance and dissemination of datasets published on the EU Vocabularies website and on the ShowVoc platform. He is particularly involved in linked data initiatives, supporting the team in actions involving alignment, dissemination and support services.
Denis Dechandon is an experienced tool and business manager with a demonstrated history of working in the government administration industry. His studies in Romance languages at the University of Stuttgart, along with his professional activities, have centred around multilingualism, linguistics, translation, natural language processing, and foreign languages. Additionally, Denis has focused on semantic technologies, interoperability, knowledge organisation system creation and maintenance, and process automation.
With a track record spanning 30 years, he has dedicated some 20 years to providing linguistic services within the framework of EU institutions. Over the last decade, Denis has shifted his focus to the cutting-edge field of semantic technologies, further enriching his expertise.
Denis’ contributions to the field of language technology and translation have been recognised through his active participation in various conferences, including those centred around knowledge management. He is also deeply engaged in organising Translating and the Computer annual conferences, JIAMCATT annual and local meetings, and ENDORSE conferences and follow-up events.
Presently, Denis is engaged in activities that focus on enhancing semantic interoperability and promoting the widespread use of semantic technologies. His core mission revolves around supporting government administrations and national public services in creating, maintaining, enhancing, and disseminating semantic assets and tools. The ultimate goal is to foster increased data flows, seamless data sharing and reuse, and the promotion of further developments in the field of Linguistic Linked Open Data.
Throughout his professional journey, Denis has maintained a modest yet confident approach, consistently striving for excellence in all his endeavours. His comprehensive understanding of the language services sector and his growing expertise in the realm of semantic technologies make him a very knowledgeable professional.
Carolina Dunaevsky is a versatile language expert and technology specialist, holding a Diploma in Translation from the National University of Cordoba (Argentina) and a Master of Arts in Terminology and Language Technologies from the Technical University of Cologne (Germany).
She currently works as a Terminologist and Language Technology Specialist at the Court of Justice of the European Union (CJEU) in Luxembourg, where she excels in terminology management and linguistic analysis. Carolina’s achievements include successfully migrating CJEU’s terminological database to the InterActive Terminology for Europe (IATE) platform and actively participating in EU interinstitutional meetings related to terminology and language technologies. She is multilingual, proficient in German, English, French, and Spanish, and has experience as a translator. Additionally, Carolina has served as a visiting lecturer at the University of Luxembourg, contributing to the course “Translation and EU Terminology.” Her combination of language skills and tech expertise is a valuable asset to both the CJEU and the wider linguistic community.
Gender-Fair Language in Machine Translation: Insights into Bias and Post-Editing Effort
Google Translate Error Analysis for Mental Health Information: Evaluating Accuracy, Comprehensibility, and Implications for Multilingual Healthcare Communication
Jaleh Delfani holds a PhD in Translation and Interpreting Studies from the University of Surrey and an MSc in Language Technologies from Bangor University, where she gained a comprehensive understanding of the theoretical and practical aspects of natural language processing, computational linguistics, and natural language understanding. During her studies, she delved deeper into the intricacies of translation and interpreting, examining the impact of technology on these domains.
Currently, she serves as a research fellow in Language Technologies at the School of Literature and Languages, Centre for Translation Studies, University of Surrey. In this role, she actively contributes to the advancement of language technologies, with a specific focus on the development of Natural Language Processing (NLP) applications for low-resource languages. Recognising the importance of preserving linguistic diversity, she places a special emphasis on Persian and leverages her expertise to address the challenges associated with developing NLP tools and resources for this language.
Jaleh’s research portfolio extends beyond Language Technologies and encompasses a wide range of topics. She is deeply interested in the fields of translation and multimodal technologies. Within the realm of translation, she explores the impact of technology on translation processes and the development of machine translation systems.
Constantin Orasan is a Professor of Language and Translation Technologies at the Centre of Translation Studies, University of Surrey, UK and a Fellow of the Surrey Institute for People-Centred Artificial Intelligence. Before starting this role, he was a Reader in Computational Linguistics at the University of Wolverhampton, UK, and the deputy head of the Research Group in Computational Linguistics at the same university. He has over 25 years of experience in the fields of Natural Language Processing (NLP), Translation Technologies, Artificial Intelligence and Machine Learning for language processing. His recent research focuses on the use of Generative AI as a support tool for translators. His research is well known in these fields as a result of over 130 peer-reviewed articles in journals, books and international conferences. More information about him can be found at https://dinel.org.uk/
Dr Özlem Temizöz is a postdoctoral researcher at the Centre for Translation Studies (CTS), School of Literature and Languages, University of Surrey, UK. Her research interests revolve around language technologies and their impact on the translation process, product, and the translator. Having worked on projects exploring postediting workflows and directionality in the translation process, she is currently engaged in research projects investigating technology-supported collaborative translation and multilingual communication in healthcare settings.
Eleanor Taylor-Stilgoe is a postgraduate researcher in the Centre for Translation Studies at the University of Surrey. She completed her BA Hons in Hispanic Studies at King’s College London in 2007 and her MA in Spanish to English Translation at Surrey in 2009. She worked as a professional freelance translator for just over 8 years before returning to the University of Surrey under an Expanding Excellence in England PhD studentship in 2020. Having worked in UK public and private healthcare administration for several years whilst freelancing, Eleanor is currently applying her knowledge and experience of translation and healthcare to researching risk in the use of machine translation (MT) in healthcare settings. She recently presented a working paper on this topic at the HIT-IT 2023 conference in Naples, for which she also served on the Programme Committee. She has also previously worked as an Associate Tutor in Translation from Spanish to English at the University of Surrey, where she continues to serve as the Course Leader for A1 Spanish evening classes.
Hadeel Saadany is a Postdoctoral Research Fellow in Natural Language Processing at the Centre of Translation Studies, University of Surrey. She is currently working as the main researcher on an Innovate UK project for facilitating legal information acquisition via Natural Language Processing tools. She is also working on an e-commerce project in collaboration with eBay Inc. to improve the company’s automatic product search tools. Before embarking on the field of Natural Language Processing, Dr Saadany was an Applied Linguistics Lecturer. Her research interests mainly focus on the use of data and Natural Language Processing technology in several domains, such as legal AI, machine translation for health purposes, and information retrieval in e-commerce.
Diptesh Kanojia is a Lecturer at the Surrey Institute for People-Centred AI, University of Surrey, working on research problems in Natural Language Processing (NLP). His research focuses on Quality Estimation and Automatic Post-editing for Machine Translation. He was awarded a joint PhD from the Indian Institute of Technology Bombay, India and Monash University, Australia, in 2021 for his thesis on detecting cognates across more than ten Indian languages. This developed his expertise in working with Indian languages across NLP areas, such as aggression and hate on social media, and information extraction for named entities, abbreviations, and acronyms. He is leading industry-sponsored research projects with eBay Inc. and another project sponsored by the European Association for Machine Translation.
Sabine Braun is Professor of Translation Studies, Director of the Centre for Translation Studies at the University of Surrey, and a Co-Director of Surrey’s Institute for People-Centred Artificial Intelligence. Her research explores human-machine interaction and integration in translation and interpreting, especially to improve access to critical information, media content and vital public services. She is currently leading Workpackage 3 ‘Barriers, needs and communication strategies’ of the European-funded Mental Health 4 All project.
Dr. Barbara Schouten is an Associate Professor at the University of Amsterdam’s Faculty of Social and Behavioural Sciences, department of Communication Science and the Center for Urban Mental Health. Her research interests include the role of language-related and culture-related factors in explaining communication difficulties between healthcare providers and ethnic minority patients with low language proficiency in the host country’s dominant language(s). In addition, she focuses on the use of technology in interventions to mitigate language and culture-related barriers in communication.
The growth of Large Language Models (LLMs) and their practical applications in translation
Large Language Models (LLMs) emerged around the same time as NMT systems but were initially designed for a different purpose: the idea behind LLMs was to create a language model that could understand natural language and be used for a variety of tasks, such as question answering, sentiment analysis, and text classification.
What does this mean for the future of the localization industry? We can see a direct transition from NMT to further and more extensive adoption of MT. The improvement in quality, the constantly improving output, the ease of creating custom models and the ability to cope with very long sentences all remove previous barriers to wider adoption of NMT. Human translation is rarely perfect. Human translators make errors, both factual and grammatical, when translating, so LLM-based MT offers the capability to improve translation quality in general.
Roberts Ervīns Ziediņš
Hierarchical Data Linkage in a Terminology Management System: Challenges and Solutions at Bioleksipēdija
Since 2021, a collaborative team of terminologists, translators, researchers, and information system developers has been developing Bioleksipēdija, a new open-access, interactive, multifunctional information management system. The system is designed for the storage of special lexis data and offers a wide range of statistical and search options for language research and comparative multilingual studies in linguistics. It serves as a terminology management tool and will be published in December under the domain bioleksipedija.lv. This article describes one module of the system: the hierarchical data linkage module. The module links scientific (Latin) names of organisms within a systematic tree structure and incorporates vernacular names of organisms, which are in turn linked to publications. It allows translators to search for and analyse precise terminology, considering both a taxon’s placement within the systematic tree and its frequency of use, as measured by mentions in real publications. The system is already in use for data collection: as of 12 July 2023, it stored 65,924 scientific and 84,596 vernacular names of organisms, 1,657 names of diseases caused by organisms, 3,013 dictionary words, 403 terms, and 562,622 linkages across 8,533 bibliography units.
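The hierarchical linkage described above can be pictured as a small taxonomic tree in which each node carries vernacular names linked to publications, so that a name's frequency of use is simply the number of linked publication mentions. The sketch below is a hypothetical illustration under assumed class and field names, not Bioleksipēdija's actual data model.

```python
from dataclasses import dataclass, field

@dataclass
class Taxon:
    """A node in the systematic tree: one scientific (Latin) name."""
    scientific_name: str
    # vernacular name -> list of publication identifiers mentioning it
    vernacular_names: dict
    children: list = field(default_factory=list)

    def add_child(self, child):
        self.children.append(child)
        return child

def frequency(taxon, vernacular_name):
    """Frequency of use, measured as mentions in linked publications."""
    return len(taxon.vernacular_names.get(vernacular_name, []))

# A tiny fragment of a systematic tree (toy data)
rosaceae = Taxon("Rosaceae", {})
malus = rosaceae.add_child(
    Taxon("Malus domestica", {"apple": ["pub1", "pub2"], "ābele": ["pub3"]})
)

print(frequency(malus, "apple"))  # 2
```

A translator-facing search would then combine this count with the taxon's position in the tree to rank candidate terms.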
Mg.sc.comp. Karina Šķirmante has served as a lecturer since 2013 and embarked on her research journey in 2017 at Ventspils University of Applied Sciences in Latvia. Her primary research focus lies in the realm of information technology solutions for practical applications. This encompasses the development of data processing methods and information systems tailored for specific use cases in both the astronomical and linguistic domains. In her role as an educator, Karina imparts her knowledge by offering a variety of programming-related courses, including Software Development, Java Programming, and Data Structures and Algorithms. Her dedication to equipping her students with these essential skills greatly contributes to the next generation of IT professionals’ readiness for the industry. Furthermore, Karina Šķirmante is actively pursuing a PhD at the University of Latvia within the Faculty of Physics and Mathematics, demonstrating her commitment to advancing her expertise and making significant contributions to the fields of IT and applied sciences.
Dr. Silga Sviķe
Silga Sviķe is an assistant professor and researcher at Ventspils University of Applied Sciences in Latvia. In 2016, she was awarded a PhD in Applied Linguistics from Liepaja University and Ventspils University of Applied Sciences (promotional paper “Special Lexis in General Bilingual Dictionaries: Plant Names”). At Ventspils University of Applied Sciences, Dr. Silga Sviķe teaches Terminology and Lexicography, Contract Translation, Translation of Technical Texts, and German as a Second Foreign Language in the bachelor’s programme, and Translation of Commercial Documents in the master’s programme. Her research topics are terminology, translation, lexicography and terminography. She is currently project manager for “Smart complex of information systems of specialized biology lexis for the research and preservation of linguistic diversity”.
Dr. Arturs Stalažs
Dr. agr., leading researcher, Institute of Horticulture, Latvia
Research interests: biological terminology, in particular organism naming; his daily research topics also include plant pests and invasive organisms.
Address: Graudu iela 1, Ceriņi, Krimūnu pagasts, Dobeles novads, LV-3701, Latvia
ORCID iD: 0000-0002-3971-6255
Gints Jasmonts has been working at the Ventspils International Radio Astronomy Centre since 2019 as a programmer/researcher and as a guest lecturer at Ventspils University of Applied Sciences. His research involves the creation of IT solutions for various issues in the radio astronomy and linguistics fields, with a focus on information system development and data digitization in the latter.
Roberts Ervīns Ziediņš
Roberts Ervīns Ziediņš is a qualified developer from Ventspils University of Applied Sciences, specializing primarily in the Spring Framework.
Carlos Manuel Hidalgo-Ternero
How can Paidiom improve the neural machine translation of idioms?
We will present research results with Paidiom, a text-preprocessing algorithm designed for
1) converting discontinuous multiword expressions (MWEs) into their continuous forms and
2) translemmatising them, i.e., converting source-text MWEs into their target-text equivalents, in order to improve the performance of current neural machine translation (NMT) systems. To test its effectiveness, an experiment with the NMT systems of VIP, Google Translate and DeepL was carried out in the ES>EN translation direction with Verb-Noun Idiomatic Constructions (VNICs) in Spanish. The performance of Paidiom was compared both to that of our previous algorithm (gApp) and to manual conversion (our gold standard). In this regard, the promising results yielded by this study, the first to analyse Paidiom’s performance, will shed some light on new avenues for enhancing MWE-aware NMT systems.
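To make the two preprocessing steps concrete, the sketch below reunites a discontinuous Spanish VNIC into its continuous canonical form and then substitutes an English equivalent before the text would be sent to an NMT system. The pattern table, function names and toy example are assumptions for illustration only, not Paidiom's actual implementation.

```python
import re

# Toy inventory: discontinuous pattern -> (continuous ES form, EN equivalent)
MWE_TABLE = {
    r"\btomar\b(.*?)\bel pelo\b": ("tomar el pelo", "pull someone's leg"),
}

def make_continuous(sentence):
    """Step 1: rejoin a discontinuous MWE into its continuous form."""
    for pattern, (continuous, _) in MWE_TABLE.items():
        m = re.search(pattern, sentence)
        if m and m.group(1).strip():
            # Move the intervening material after the reunited expression
            sentence = re.sub(pattern,
                              continuous + " " + m.group(1).strip(),
                              sentence, count=1)
    return sentence

def translemmatise(sentence):
    """Step 2: replace the source-text MWE with its target-text equivalent."""
    for _, (continuous, target) in MWE_TABLE.items():
        sentence = sentence.replace(continuous, target)
    return sentence

s = make_continuous("tomar siempre el pelo")  # -> "tomar el pelo siempre"
print(translemmatise(s))                      # -> "pull someone's leg siempre"
```

The NMT engine then only has to translate the remaining literal material around an expression it no longer needs to interpret idiomatically.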
Carlos Manuel Hidalgo-Ternero
Dr. Carlos Manuel Hidalgo-Ternero holds a Bachelor’s degree in Translation and Interpreting (University of Malaga, Spain) as well as two Master’s degrees: one in Teaching Spanish as a Foreign Language and Other Modern Languages (University of Seville, Spain) and a second one in Secondary Education and Language Teaching (English) (University of Malaga). He also holds a PhD in Linguistics, Literature and Translation from the University of Malaga. He is currently a post-doc researcher at the Université catholique de Louvain (Belgium) and at the University of Malaga.
Furthermore, he is the author of more than 20 scientific contributions published in English and Spanish in both national and foreign journals, as well as in the form of book chapters in collective works published by SPI first-quartile publishers. In addition, he is the co-chair of Just say IT: International Workshop on Interpreting Technologies (SAY-IT 2023).
His two main lines of research are corpus linguistics and computational phraseology applied to translation in which, more specifically, he has specialised in the preprocessing of idioms to improve current neural machine translation systems. As a result of this research, he designed the system gApp, registered by the University of Malaga (intellectual property registration) in Safe Creative (https://www.safecreative.org/work/2011165898461-gapp).
Human & Machine Translation Quality: Comparing & Contrasting Concepts
Quality assurance is a central component of both human and machine translation with different points of view from the perspective of Translation Studies (TS) and the field of Machine Translation (MT). Whereas Translation Studies focuses on the purpose, on pragmatic aspects of translation as well as on comprehensibility, Quality Assessment (QA) in the field of Machine Translation includes QA frameworks for assessment by humans, consisting of manual error classification, and by machines, e.g. comparing the translation to a reference translation.
In an attempt to bridge the gap between the two fields, this paper focuses on comparing and contrasting central concepts of assessing translation quality in both fields, TS and MT, providing an overview and description of overlapping quality concepts of the two fields, based on an extensive systematic literature review on translation quality (assessment). The detailed descriptions and comparisons of the points of view from both sides will provide a valuable means of reference for the points of intersection of the fields of translation studies and machine translation.
Bettina Hiebl is a PhD student in Transcultural Communication at the University of Vienna; her dissertation is a comparative analysis and practical framework on translation quality from the perspectives of translation studies and computational linguistics. She graduated from the master’s program in Interpreting (German-Italian-English) at the University of Vienna in June 2011 with a master’s thesis on simultaneous consecutive interpreting with the Livescribe™ Echo™ smartpen. To complement her studies in the field of translation, she also completed the diploma program in Comparative Literature at the University of Vienna, from which she graduated in November 2010, and the bachelor’s program in Business, Economics and Social Sciences, Major in Business Administration, at the Vienna University of Economics and Business, from which she graduated in January 2014.
She has been working as a freelance translator and interpreter since February 2012. In addition, she has been a part-time employee as the project coordinator of the Vienna School of Mathematics, the joint doctoral school of the mathematics faculties of the University of Vienna and TU Wien since 2016 as well as the project coordinator for the Vienna Graduate School on Computational Optimization at the Faculty of Mathematics of the University of Vienna since 2020.
Her research interests are all aspects of human and machine translation quality, especially translation quality at the intersection of the fields of translation studies and machine translation as well as the industry.
In 2013 she received the Promotion Award for Junior Researchers of the professional group for Vienna of Service Providers/Language Service Providers of the Austrian Economic Chamber (WKO) for her master’s thesis. In 2023, she became a member of the European Association for Machine Translation, and received a bursary for presenting her work at the EAMT 2023 conference in Tampere, Finland.
Dagmar Gromann (ORCID: 0000-0003-0929-6103) is Assistant Professor Tenure Track at the Centre for Translation Studies of the University of Vienna with prior post-doc experience at TU Dresden and the Artificial Intelligence Research Institute (IIIA-CSIC) in Barcelona, Spain. Her research focuses on language technologies, in particular machine translation and multilingual knowledge extraction, and their socio-technical implications, such as gender bias beyond a binary conception of gender. As a computational linguist working in a translation studies department, she is particularly interested in bringing both communities together in a mutually beneficial way. She was responsible for creating a new master’s curriculum on Multilingual Technologies for translation and computer science students.
She was leader of a European Language Grid (ELG)-funded project Extracting Terminological Concept Systems from Natural Language Text (Text2TCS https://text2tcs.univie.ac.at/), which is available as a service on the ELG platform, and the Center for Technology and Society (CTS)-funded project GenderFairMT (https://genderfair.univie.ac.at).
Currently, she is vice chair and working group leader of the COST Action NexusLinguarum (CA18209) on Web-centred linguistic data science (https://nexuslinguarum.eu/) and a secondary partner in the Language Data Spaces (LDS) EU project led by DFKI.
She is on the editorial board of the Semantic Web Journal, the Journal of Applied Ontology, and the newly created journal on Neurosymbolic Artificial Intelligence. She has co-authored more than 70 peer-reviewed publications and has helped organize numerous international conferences and workshops, currently organizing the 4th Conference on Language, Data and Knowledge (LDK) in Vienna this September.
Human-Centred Machine Translation via Interaction Design
Language technologies have advanced rapidly in recent years, and their impact on our society is changing ever faster. Machine translation (MT) is one of the technologies with the greatest impact. Consequently, the spectrum of MT users keeps growing: professional translators, people with no knowledge of a language who need to understand a text, academics writing in a language that is not their L1, etc. Despite the vital importance of MT, research has focused on the productivity and quality offered by MT. In human-machine interactions involving MT, one element has often been overlooked in both research and industry: the user. What does an MT user feel or experience when interacting with MT systems? Is this experience appropriate to what the user is looking for? Do developers of new products take users’ needs into account? Through a transdisciplinary perspective, drawing on elements from TS, HCI and MT, we shift the focus from productivity and quality towards what the MT user experiences and feels during their interaction with MT, analysing the Machine Translation User Experience (MTUX) concept as a tool to measure the user experience when interacting with MT.
Vicent Briva-Iglesias holds a BA (Hons) in Translation and Interpreting from the Universitat Jaume I (Spain), as well as an MSc (Hons) in Translation Technologies from the Universitat Autònoma de Barcelona (Spain). Currently, he is pursuing a PhD on MT and HCI at the School of Applied Language and Intercultural Studies (SALIS) at Dublin City University, and he is funded by the SFI Research Centre for Digitally-Enhanced Reality (D-REAL).
Regarding academic experience, Vicent lectures the “Professional Orientation for Translators” and “Introduction to Python Programming for Linguists” modules at the Universitat Oberta de Catalunya in the Master’s Degree of Translation and Technologies, and co-lectures the “Localisation” module at Dublin City University.
He has also presented his research at various international conferences (e.g., AMTA, IATIS, NeTTT, Translating and the Computer) and has publications in conference proceedings (NAACL, EAMT) and journals (Procesamiento del Lenguaje Natural, Tradumàtica, Mutatis Mutandis, and Translation, Cognition and Behavior) in the MT, Translation Studies, HCI and NLP fields.
Vicent’s interest in the academic aspects of translation is heavily influenced, and, in a way, enhanced, by his professional approach and practice of translation. He runs AWORDZ Language Engineering LTD, a small language service provider, and has worked as a professional language engineer for 5+ years. In addition, he is an English, Spanish, Catalan, and Valencian certified sworn translator, appointed by the Government of Catalonia.
Integrating ChatGPT in academia and in the office – a case study
Recent technological developments are having a great impact on the translation world. Institutions offering translator training programmes are called upon to catch up by modifying their curricula accordingly, without, however, having a clear idea of how these developments will unfold in the near future. Freelancers and small translation companies are also called upon to step up and adapt to the new, fluid reality. The current presentation forms part of a larger study on the status of AI in the Greek-speaking translation landscape. It reports on work in progress focusing on whether, and how, translator trainers as well as freelancers and small translation companies in Greece and Cyprus have started integrating ChatGPT into their programmes or workflows, respectively. Questionnaires are used as a first step to detect areas where both groups might collaborate toward paving a (more) rewarding future for everyone involved.
Kyriaki Kourouni is a Senior Fellow at the Department of Translation and Intercultural Studies. She holds a BA (Hons) from Aristotle University, an MA in Translation from the University of Surrey, UK, as well as a DEA in Translation and Intercultural Studies and a European Doctorate cum laude from Universitat Rovira i Virgili, Spain. She teaches courses related to scientific and technical translation, translation technology and translation studies. She has over 10 years’ experience in translation and subtitling. Her research interests include translator training and translation technology.
She has served as Vice-President of the Panhellenic Association of Translators (www.pem.gr, 2008-2010), as a member of the Translation Technology Committee set up by the International Federation of Translators (www.fit-ift.org, 2010-2012), as Board Member of the Hellenic Society for Translation Studies (http://hst-translationstudies.gr, 2017-2019) and as Board Member of the European Society for Translation Studies (EST, http://www.est-translationstudies.org, 2013-2016, 2016-2019 as Vice-President, 2019-2022). She is currently Board Member of the Greek Applied Linguistics Association (GALA, https://www.new.enl.auth.gr/gala/, 2021-2023).
Cristina Toledo Báez
Laura Noriega Santiáñez
Introducing the GAMETRAPP project: app for post-editing neural machine translation using gamification
The world of translation has experienced a tremendous change with the emergence of neural machine translation (NMT) systems, which have reshaped multiple professional realities in different fields. Their arrival has developed new ways of conceiving translation practice and has led to the birth of post-editing (PE). Our project is framed within the multilingual context and the need to disseminate science and new research advances, especially in English. The GAMETRAPP project is funded by the Spanish Ministry for Science and Innovation (TED2021-129789B-I00), and its main goal is to bring NMT plus full PE of abstracts closer to researchers in multiple fields by means of a gamified environment. Gamification is an increasingly popular learning technique that helps and motivates the student to learn through playful activities. This article is structured in five sections. Section 1 introduces the topic of NMT plus full PE and gamification, followed by section 2, in which the basis of the project is explained. Section 3 describes the protocolised methodology we are following for the classification of abstracts. Section 4 briefly explains the gamified environment we are setting up. This leads to section 5, in which the conclusions of our research are drawn.
Cristina Toledo Báez
Cristina Toledo-Báez is an Assistant Professor at the Department of Translation and Interpreting of the University of Málaga (Spain). She has been awarded several predoctoral and postdoctoral research grants for stays at Copenhagen Business School (Denmark), the University of Wolverhampton (United Kingdom), Dickinson College (USA)
and Centre Privé de Langues (France). Her current research interest is related to machine translation and post-editing. She is leading two research projects: NEUROTRAD, on human-machine parity and neural machine translation, and GAMETRAPP, on training for post-editing neural machine translation using gamification.
Laura Noriega Santiáñez
Laura Noriega-Santiáñez has a BA in Translation and Interpreting (English and French) from the European University of the Atlantic (2019) and an MA in Translation for the Publishing Industry from the University of Malaga (2020). She has worked as a Lecturer in the Department of Translation and Interpreting at the University of Malaga, and as a translator, proofreader, and editor. She is currently a PhD candidate and a member of the research staff in the GAMETRAPP project (ref. TED2021-129789B-I00). Her research interests include literary translation, corpus linguistics, phraseology, and translation and interpreting technologies.
Is Catalan ready to become an EU official language? Terminological Resources and Project IATE-CAT
Due to recent political agreements in Spain, Catalan, the official language of Catalonia, is on the brink of becoming an official language in the EU. However, for this transition to be successful, it is imperative to have comprehensive terminological resources that cover all fields and allow for the accurate translation of all EU official documents into Catalan. One project that is currently addressing this need is IATE-CAT, the Terminology of IATE in Catalan. IATE (Interactive Terminology for Europe) is the terminology management system for the EU. The IATE-CAT project aims to increase and compile terminology specifically for the translation of the Acquis Communautaire into Catalan. To achieve this goal, the methodology applied in this project combines the compilation of openly available linguistic resources from different sources with the creation of specialized parallel corpora, mainly Catalan-Spanish corpora, using natural language processing tools to process, validate and publish the results in an open format. The IATE-CAT project is a collaborative effort between the Open University of Catalonia and the Centre for Terminology TERMCAT, the official body in charge of Catalan language terminology.
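One simple way to mine equivalence candidates from such aligned corpora is co-occurrence counting over segment pairs: source and target words that repeatedly appear in the same aligned segments are likely translation equivalents. The sketch below is a minimal hypothetical illustration with toy data, not the actual IATE-CAT pipeline, which combines multiple NLP tools and human validation.

```python
from collections import Counter
from itertools import product

# Toy aligned Spanish-Catalan segment pairs (hypothetical data)
pairs = [
    ("reglamento delegado", "reglament delegat"),
    ("reglamento de ejecución", "reglament d'execució"),
    ("reglamento delegado", "reglament delegat"),
]

# Count how often each (source word, target word) pair co-occurs
# in the same aligned segment
cooc = Counter()
for es, ca in pairs:
    for s, t in product(es.split(), ca.split()):
        cooc[(s, t)] += 1

# The most frequent pairing is a first candidate term equivalence,
# to be validated by terminologists before publication
best = cooc.most_common(1)[0]
print(best)  # (('reglamento', 'reglament'), 3)
```

Real pipelines refine this with association measures and multiword handling, but the co-occurrence signal over aligned segments is the underlying idea.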
Mercè Vàzquez holds a PhD in Cognitive Science and Language from Pompeu Fabra University, as well as bachelor’s degrees in Information Science from the Open University of Catalonia and in Catalan Language and Literature from the University of Barcelona. She is a lecturer in the Faculty of Arts and Humanities at the Open University of Catalonia. Her research focuses on automatic terminology extraction, corpus linguistics, translation and linguistic analysis.
Antoni Oliver is an associate professor at the Universitat Oberta de Catalunya (UOC) and the director of the master's degree in Translation and Technologies. He holds a PhD in Linguistics from the Universitat de Barcelona, a master's degree in Free Software from the Open University of Catalonia, a bachelor's degree in Slavonic Philology from the Universitat de Barcelona, and a degree in Telecommunications Engineering from the Universitat Politècnica de Catalunya. His main areas of research are machine translation and terminology extraction.
Sergi Alvarez-Vidal holds a PhD in Translation and Language Sciences from Universitat Pompeu Fabra. He is currently an Adjunct Professor at the Universitat Oberta de Catalunya (UOC). He has worked as a freelance translator for more than 15 years, specializing in technical translation and localization. His research focuses on how MT can affect translations and translators, mainly studying post-editing and its effect on the translation process.
Leveraging AI for Term Extraction: An experiment from the European Commission’s DG Translation
The Directorate-General for Translation (DGT) of the European Commission has been investigating the application of artificial intelligence in extracting terms to enhance translation workflows, refine terminology management, and promote advancements in the field. In DGT, we identified several use cases for incorporating monolingual English terminology extraction into the translation process. These include conducting proactive terminology work, enabling translators to assess text complexity, and integrating term lists into Computer-Assisted Translation tools. Potential future machine-to-machine use cases include improving machine translation, enriching documents with metadata, and enhancing existing term extraction mechanisms.
We developed a terminology recognition tool named ‘TermiteOne,’ which relies on an LSTM-CRF model from the open-source Flair NLP library and is trained using European Union legislation. The tool was compared against SynchroTerm and assessed by terminologists and translators, yielding positive results across domains and producing little noise. A second model was trained using a different corpus, capturing a higher number of terms. Both models will be utilised together by the tool to avoid overlooking potential terms of interest. Additionally, a large language model (GPT-4) is being used for term extraction and comparison with the smaller specialised models. The application of AI for term extraction appears highly promising and warrants further investigation.
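TermiteOne itself relies on an LSTM-CRF neural model, which cannot be reproduced in a few lines; purely to illustrate the shape of the task (raw text in, ranked candidate terms out), the sketch below uses a deliberately simplified frequency-based heuristic. The heuristic, stopword list and sample text are our own illustration, not the DGT tool.

```python
# Simplified illustration of monolingual term-candidate extraction:
# rank stopword-filtered n-grams by frequency. A real system such as
# TermiteOne labels term spans with a trained sequence model instead.
from collections import Counter
import re

STOPWORDS = {"the", "of", "and", "a", "an", "in", "to", "for", "is", "on", "by"}

def candidate_terms(text: str, max_len: int = 3, min_freq: int = 2):
    """Return n-grams (1..max_len tokens) that never start or end with a
    stopword and occur at least min_freq times, most frequent first."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            if any(len(t) < 3 for t in gram):  # drop very short tokens
                continue
            counts[" ".join(gram)] += 1
    return [(term, c) for term, c in counts.most_common() if c >= min_freq]

text = ("The internal market shall comprise an area without internal "
        "frontiers. The internal market ensures the free movement of goods.")
print(candidate_terms(text))  # e.g. ('internal market', 2) among the candidates
```

Even this crude baseline surfaces multiword candidates such as "internal market"; the point of a trained model is to keep such terms while suppressing the noise a frequency heuristic lets through.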
Borislav Gueorguiev is a translator and Language Technology Coordinator at the Bulgarian Language Department, Directorate-General for Translation, European Commission. He is also a hands-on AI explorer as a member of the AI Network at DGT, an AI exploratory initiative within DGT's Innovation Lab.
His background as a software developer, combined with his current job, naturally led him towards utilising Artificial Intelligence (AI) for Natural Language Processing (NLP) to enhance the translation processes.
Matej Vlaciha is a translator and Language Technology Coordinator at the Slovak Language Department, Directorate General for Translation, European Commission. He is also a member of a group of hands-on AI explorers within DGT’s AI Network, which in turn falls under the grassroots Innovation Lab.
As a Language Technology Coordinator, he has cultivated a strong understanding of Computer-Assisted Translation (CAT) tools, and he has developed a keen interest in Artificial Intelligence (AI), particularly in the field of Natural Language Processing (NLP) to improve the translation processes.
Live speech-to-text and machine translation tool for 24 languages – the project, its implementation and lessons learned
Looping Subject Matter Experts (SMEs) into Your Translation Process
With the increased fluency of automated translation, whether it be through Neural Machine Translation, Large Language Models or other forms of artificial intelligence, the translator’s role is shifting. For highly specialized domains, a bilingual SME may be required rather than a linguist. For more creative domains, a local native language speaker may need to review.
Traditional translation productivity tools were built for linguists, who understand translation memories, termbases, TQA, and similar features. These tools are overly complex for experts-in-the-loop, who have no linguistic or technology training but are required for their domain knowledge to review the translation results.
Follow our journey of collaboration with industry partner Sika to create different tooling to better serve this increasingly common use case.
Matthias Heyn has been a pioneer of Trados translation technologies since 1992, focusing on multilingual processes and related technologies in a wide range of industries. Working across product engineering and business development positions, from start-ups to large-scale organizations, he has focused on public-sector organizations in the EU and big pharma in the area of multilingual regulatory processes. With a long-standing interest in cloud-facilitated SaaS delivery models, he has been involved with Trados Cloud since 2019.
Multilingual Terminology and Institutional Translation in Europe: The Role of Thematic Termbases in Computer-Assisted Translation
Translators face the challenge of maintaining terminology consistency throughout their work, particularly in institutional translation. Efficient reproduction, dissemination, and reuse of expanding data volumes are crucial in specialised translation and knowledge management. Multilingual reference data, such as termbases, taxonomies, thesauri and ontologies, play a vital role in achieving this. Termbases are curated collections of terms and translations or definitions, created semi-manually by terminologists. They are important for several reasons. Firstly, they ensure terminological consistency by providing pre-approved terms, promoting uniformity in translations. Secondly, accuracy and precision are ensured, which is crucial in legislative translation, as even slight variations in terminology can significantly alter the meaning. Thirdly, termbases save time by reducing the need for extensive research to find specialised term translations. Lastly, they foster knowledge-sharing and collaboration among translators through accessible termbases. This proposal emphasises the importance of EU open data initiatives in promoting transparency and collaboration in multilingual institutional translation and legislative terminology, with the Terminology Coordination Unit of the European Parliament leading the way in providing freely accessible, targeted termbases. Effective termbase management strategies enable translators to enhance the quality and accuracy of their legislative translations while facilitating knowledge exchange and cooperation among the EU’s institutions.
Victoria Saura-Montesinos is responsible for the preparation of termbases and other terminology tasks in the Terminology Coordination Unit (Directorate-General for Translation) of the European Parliament, and a member of the research project "Applications of Digital Linguistics to the Field of Terminology: The Creation of a Bilingual Relational Lexicon of Terminological Uses of Lexical Semantics (TerLexWeb)" (2023-2027) of the Institute of Applied Linguistics (ILA), in the framework of the State Subprogram for Knowledge Generation and funded by the Spanish State Research Agency (AEI). She is currently pursuing a PhD in Terminology at the University of Cadiz (Spain), where she also graduated in Linguistics and Applied Languages and in English Studies. Her research areas are Terminology and Terminography, Lexical Semantics and Corpus Linguistics.
Navigating Through Different Flavors of Machine Translation: Free, Custom, Fine-Tuned, Generic, or Large Language Model-Based? A Comparative Evaluation Study
Machine translation (MT) systems, ranging from neural MT platforms like Google Translate and the European Commission's eTranslation to large language models (LLMs) such as GPT-4, have become ubiquitous. Additionally, tools like OPUS-CAT provide access to free pre-trained NMT models and offer fine-tuning capabilities using custom translation memories. Amidst this vast landscape and the recent hype around LLM-based MT, our research seeks to assess the translation quality of various systems in the field of localization to help professional translators make informed decisions about their MT use. Building on our prior work, which examined the fine-tuning of MT engines across three language pairs (English-Turkish, English-Spanish, and English-Catalan) using different localization corpus sizes, our current study contrasts top-performing fine-tuned engines from the previous study with outputs from platforms like Google Translate, eTranslation, GPT-3.5, and GPT-4. For English-Spanish and English-Turkish, we include all the aforementioned systems, while for English-Catalan we compare Google Translate, GPT-3.5, GPT-4, and Softcatalà, noting that eTranslation does not support Catalan. Our methodology involves translating 210 sentences from the localization domain using these MT systems. We then evaluate the translations using four automatic metrics: BLEU, ChrF, TER, and COMET, available on the MATEO platform. We report preliminary findings across language pairs.
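Of the four metrics, ChrF is simple enough to sketch in plain Python. The snippet below is a minimal illustration of the character n-gram F-score (Popović, 2015), not the exact sacreBLEU/MATEO implementation the study uses; the example sentences are invented.

```python
# Minimal ChrF sketch: average character n-gram precision and recall
# over n = 1..max_n, combined as an F-beta score (beta = 2 favours recall).
from collections import Counter

def chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    def ngrams(s: str, n: int) -> Counter:
        s = s.replace(" ", "")  # ChrF ignores whitespace
        return Counter(s[i:i + n] for i in range(len(s) - n + 1))

    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = ngrams(hypothesis, n), ngrams(reference, n)
        if not hyp or not ref:
            continue  # sentence too short for this n
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r)

print(chrf("the cat sat on the mat", "the cat sat on the mat"))  # 1.0 (perfect match)
print(chrf("a cat was on the mat", "the cat sat on the mat"))    # partial overlap, < 1.0
```

BLEU and TER are computed over token n-grams and edit operations respectively, while COMET requires a trained neural model; in practice all four are obtained from a toolkit rather than reimplemented.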
Gokhan Dogru is a visiting postdoctoral researcher at ADAPT-DCU, affiliated with the Faculty of Translation and Interpreting at Universitat Autònoma de Barcelona (UAB) in the framework of a Margarita Salas Grant. His research interests include terminological quality evaluation in machine translation, different use cases of MT for professional translators, and the intersection of the translation profession and translation technologies, as well as localization.
PracticeAI: Leveraging LLMs and Speech Synthesis for Material Generation in Interpreting Training
Studying the Need to Optimize Researching Amendments and Corrigenda in EU Institutional Translation
To comply with EU institutional translation standards, linguists must carefully research amendments and corrigenda to ensure accuracy and consistency. Our study explores the importance of consolidation: the action of combining an initial act with its subsequent amendments and corrections in a single consolidated document.
Firstly, we discuss specific translation scenarios where it is critical to consult consolidated documents, as well as corrigenda and amendments not covered by consolidation, and highlight the challenges they present.
Secondly, we provide statistics on the proportion of documents affected by modifications and/or consolidation in a fundamental segment of the EU legislative corpus. We examine the set of regulations, directives, and decisions adopted as basic acts by the ordinary legislative procedure, drawing statistics on the extent of consolidation as well as on unincorporated amendments and corrigenda. We find that the majority of the examined regulations and directives have a consolidated version, and that non-consolidated modifications in this segment are rare.
Our results underline the need for careful and laborious research on the history of reference documents and their metadata. We aim to improve this process through our online concordance tool Juremy.com, by displaying metadata on consolidation and corrigenda, and thereby further supporting linguists in obtaining high translation quality.
Tímea is a lawyer and has been a registered member of the Hungarian Bar Association since 2007. Her expertise covers various aspects of intellectual property, including copyright, software licensing, databases, trademark and data protection matters. Tímea holds an LL.M. (Master of Laws) degree in Information & Communication Technology Law. She also holds a bachelor's degree in international communication studies.
Tímea also has 8 years of experience as a freelance lawyer-linguist for the Court of Justice of the European Union in FR-HU and EN-HU language pairs. She has been involved in legal translation projects since the beginning of her professional activity as a lawyer.
She is the co-founder and managing partner of the online application Juremy.com EU Terminology Search, which was launched in 2019 to support linguists’ EU terminology research workflow. Tímea is passionate about legal and terminology research, legal language and translation productivity.
Robin is an all-round software engineer with 15 years of work experience (gross). He acquired an MSc in Computer Science and Engineering at the Budapest University of Technology, majoring in Artificial Intelligence (the old school kind). He investigated weird bugs at Ericsson, developed search backend software at EPAM, and briefly pursued an academic path in network science analyzing protein interaction networks before moving to work at Google Switzerland.
He is passionate about graph analysis and other interesting algorithms, engineering infrastructure, education and productivity, and beyond all, building user-centric features. Robin is the co-founder of the EU terminology concordance search tool Juremy.com, and also the lead developer of the tool’s functionalities since its launch.
The presenters will also be moderating the Workshop “Terminological accuracy and translation quality by efficient EU Corpus research: using Juremy in the Post-Editing process” at TC45.
Subtitling in 24 languages at the European Parliament: the making of a high-quality service
At the European Parliament, we subtitle speeches, documentaries, infographic videos and interviews in 24 languages. We use clear language concepts to bring Parliament’s multimedia content into every living room and onto every smartphone in all 24 official EU languages, so that no citizen feels left behind.
To that end, in 2020 we built a service from scratch, harnessing the power and potential of technology with one thing in mind: a service that enables us to deliver subtitling of the highest quality for our multilingual and multicultural audience, from all 27 Member States. This was done by not only looking at the future of subtitling and audiovisual translation in general, but also learning from the history of subtitling in the commercial industry.
In the course of this presentation, we will first outline how we built this service, and will then focus on the dimensions of quality management, assurance and control that were adopted in the subtitling process. More specifically, the presentation will look at three key indicators of process quality: workflows, technology, and the quality of source and reference materials. Finally, we will outline the key lessons learned and provide recommendations on how to produce and deliver high-quality subtitling on a large scale.
Dr Irene Artegiani is an Italian audiovisual translator, subtitler and researcher. She holds a degree in Translation and Interpreting from the University of Forlì (Bologna) and an MA in Audiovisual Translation from the University of Roehampton. In 2021 she obtained her PhD from the University of Roehampton with a thesis on quality and subtitling practices in the industry. Her research interests include translation technology and its ideological implications, as well as translation processes. She has worked in academic positions as an assistant and visiting Lecturer in Translation Studies, and has extensive experience as a freelance translator, especially in the subtitling of documentaries and independent productions. Irene joined the European Parliament's Subtitling and Voice-over Unit this year as a project manager.
Subtitling videos within a language service: a hands-on approach
The internal language service of “Vaudoise Insurance”, a Swiss insurance company, translates, copywrites and proofreads over 1,900 mandates a year. Thanks to its visibility strategy, it successfully positioned itself as the sole purveyor of subtitles.
Before drafting processes for video subtitling, we conducted a benchmark of current tools and technologies. We wanted a user-friendly, cheap, easy-to-install desktop tool, if possible, with speech recognition. We settled on the free open-source tool Subtitle Edit.
We follow two processes depending on the type of videos:
1. Videos with scripts only require spotting, revision, and export steps.
2. Videos without scripts require additional speech recognition and translation/spotting steps, followed by revision and export.
For interlingual videos, we translate directly while using the spotting from the intralingual video, then do revision and export steps. If there is no intralingual version, we compress translating and spotting into one step. Also, no previous transcription is needed in this case; the content is mostly colloquial and easy to translate.
For our purposes, Subtitle Edit does the trick. The process is optimized and easily understood by our translator colleagues and clients. Better speech recognition for Swiss German would be a plus in the future.
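As an illustration of what the export step produces: subtitles spotted as (start, end, text) cues map onto the SRT format that Subtitle Edit reads and writes. The sketch below shows that mapping; the cue contents are invented for illustration.

```python
# Serialize spotted cues to SubRip (.srt): numbered blocks with
# "HH:MM:SS,mmm --> HH:MM:SS,mmm" timing lines, separated by blank lines.
def srt_timestamp(seconds: float) -> str:
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(cues) -> str:
    """cues: list of (start_sec, end_sec, text) tuples, in display order."""
    blocks = []
    for i, (start, end, text) in enumerate(cues, 1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}")
    return "\n\n".join(blocks) + "\n"

print(to_srt([
    (0.0, 2.5, "Welcome to this short video."),
    (2.7, 5.0, "Let's get started."),
]))
```

In the workflows described above, the spotting step fixes the start/end times and the revision step fixes the text; only then is the file exported in this form.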
Samuel Urscheler completed his studies in specialized translation at the University of Geneva in 2016. After internships at the Federal Chancellery of Switzerland and Swiss Post, he became a translator at Vaudoise Insurance in 2018. Over his career at Vaudoise Insurance, and after completing an advanced subtitling workshop, he has become the team's leading expert for subtitles. His target language is German; his working languages are French, English, Spanish and Italian, with a focus on legal translation. In addition to his part-time occupation as a translator, he is also a professional musician, performing as a saxophonist and composer.
Sandra Casas is a translation technology specialist and translator. She studied at the University of Geneva, with a main focus on machine translation and post-editing, especially teaching post-editing to translation students. At Vaudoise Insurance, she is in charge of managing the translation workflow and maintaining the various tools used (CAT tools, TMS, the company's own NMT system, etc.). She also translates from German, English, Spanish and Italian into French. She works closely with Samuel Urscheler on subtitling requests and provides the French subtitles.
Term translation: convert or converse?
A well-known challenge that machine translation (MT) faces is the accurate translation of domain-specific terminology. While various methods have been suggested to address this challenge, they all come with limitations. More recently, the use of pre-trained language models like GPT for various natural language processing tasks, including MT, has gained significant attention, yet the potential of these models for terminology translation remains relatively unexplored. Therefore, we assess the potential of ChatGPT, a system that converses with a user: (1) in the translation task (without and with context, such as a glossary), we compare its results to those of an MT system, which converts sequences to sequences, and (2) in the post-editing task, where ChatGPT refines MT output. Manual and automated evaluations indicate that ChatGPT is outperformed by the MT system when no context is provided, and that its results improve significantly given context, reducing literal translation errors. Moreover, the post-editing results show its potential for enhancing translation accuracy, particularly in specialized domains. Nevertheless, occasional shifts in meaning and agreement errors suggest room for improvement. During the experiments, we focus on the English-Russian and English-French language pairs.
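A minimal sketch of how the "with context" condition can be set up: select the glossary entries relevant to a sentence and fold them into the prompt sent to a conversational model. The prompt wording and glossary entries here are illustrative assumptions, not the authors' actual experimental prompt.

```python
# Build a translation prompt that injects only the glossary entries
# whose source term actually occurs in the sentence to translate.
def build_prompt(source: str, src_lang: str, tgt_lang: str, glossary: dict) -> str:
    relevant = {s: t for s, t in glossary.items() if s.lower() in source.lower()}
    lines = [f"Translate the following {src_lang} sentence into {tgt_lang}."]
    if relevant:
        lines.append("Use these term translations:")
        lines += [f"- {s} -> {t}" for s, t in relevant.items()]
    lines.append(f"Sentence: {source}")
    return "\n".join(lines)

prompt = build_prompt(
    "The heat exchanger must be flushed before start-up.",
    "English", "French",
    {"heat exchanger": "échangeur de chaleur", "flow rate": "débit"},
)
print(prompt)
```

Filtering to the relevant entries keeps the prompt short; in the "without context" condition the glossary lines are simply omitted, which is what the abstract reports as producing more literal translation errors.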
Aida Kostikova began her professional journey as a translator, igniting a passion for understanding languages. She pursued studies in Intercultural Communication in Politics and Diplomacy at Moscow State University, emphasizing the importance of interdisciplinary insights. Further expanding her expertise, she attended the European Masters in Technologies for Translation and Interpreting (EM TTI) Programme at New Bulgarian University and Ghent University, where she delved into natural language processing (NLP). Presently, at Bielefeld University, her research revolves around large language models (LLMs), examining their applications, strengths, and limitations — particularly in the realms of social and political sciences. As she works toward enhancing LLMs for accuracy, reliability, and sustainability, she continues to contribute to the advancement of this field.
Kristin Migdisi holds a Master's degree in Translation from the Vrije Universiteit Brussel and a Postgraduate degree in Computer-Assisted Language Mediation from Ghent University. She joined CrossLang in October 2022 as a Junior Language Specialist, mainly focusing on the customisation of MT engines.
Sara Szoc is a Senior Language Engineer at CrossLang, with a PhD in Linguistics from the University of Leuven. She specializes in machine translation (MT) engine customization and NLP tool development, coordinating a dedicated team for various MT projects.
Tom Vanallemeersch holds a PhD in computational linguistics from the University of Leuven. At CrossLang, he customizes MT systems, provides consultancy, and manages publicly funded projects. His previous activities include development at Systran and at DG Translation of the European Commission, and the coordination of a terminology extraction project at the Dutch Language Union.
To train or not to train? Evaluating and analyzing, both humanly and automatically, the different outputs of high-quality generic and custom engines
Machine Translation (MT) has become an essential tool for communication across languages in today's world. Many customers are exploring the benefits of implementing MT in their workflows, often customizing and training generic engines to suit their needs. However, the process of customizing and training MT engines may not always be clear to customers who have limited knowledge of MT. This paper aims to explain the basic differences between generic and custom MT engines, and to examine whether training an MT engine is worthwhile when excellent generic providers are available. The study conducts a comparative analysis using two MT systems: a customer-specific engine and a selected generic provider in a specific domain. The analysis includes both human and automatic metrics to evaluate translation quality, including accuracy and fluency. The article presents a comprehensive analysis of the results, considering the trade-offs between customer-specific MT engines and generic providers in terms of quality, cost, and time. The findings contribute to the ongoing discussion about the viability of training an MT engine for customers with access to high-quality generic providers. In conclusion, this article provides a practical example of evaluating the necessity of training a customer-specific MT engine when excellent generic providers are available.
Cristina Cano Fernández is a PhD researcher at the University of Alcala (Madrid), under the Program of Modern Languages. She has been a member of the research group FITISPos-UAH (Translation and Interpreting for the Public Services Training and Research) since 2020. Her main field of research is the implementation of translation technology in different workflows, especially those of the third social sector (migration contexts). Currently, Cristina works as Product Manager and Solutions Architect for the Spanish market at Trip.com. Her previous experience includes working as Machine Translation Specialist and Product Manager at Acolad, integrating translation technology into diverse customers' workflows.
Towards a Free, Web-Based Workbench for Speech-Enabled Translation and Post-Editing: Speech Integrated COPECO
We present our work in progress: a speech-enabled, web-based translation and post-editing workbench. Previous research on speech-based translation and post-editing has used approaches demanding specific hardware setups and/or relied on commercially licensed software (e.g. Dragon, Trados Studio). These observations motivated us to create a free workbench for spoken post-editing and translation: a web-based workbench that supports speech-based translation and post-editing and is available for unrestricted use by anyone (requiring only a user account). We integrated speech recognition as well as custom post-editing speech commands into the COPECO platform. This allows translators to practise speech technology in translation and post-editing, and helps translation trainers understand the common mistakes translators encounter when using speech technology. Trainers can then use that knowledge to train student translators. The platform also serves as a resource for analysing speech-based translation and post-editing patterns. During our presentation, we expect to make our speech-enabled translation workbench (currently based on Google Speech and supporting English and French spoken translation and post-editing) available to interested audiences as an initial step, allowing them to easily explore and learn how speech-enabled translation and post-editing can be used in their day-to-day translation activities and workflows.
Jeevanthi Liyanapathirana is a PhD student at the Faculty of Translation and Interpreting, University of Geneva, where her research focuses on incorporating speech technologies for translation and post-editing purposes. She has been a fellow in translation technology as well as a translation technologist at the World Intellectual Property Organization, Geneva, and is currently working as a Document and Translation Technologies Specialist at the World Trade Organization, Geneva, Switzerland. She holds a Master of Philosophy in Computational Linguistics from the University of Cambridge, UK (MPhil in Computer Speech, Text and Internet Technology) and a Bachelor of Science (Computer Science Special Degree) from the University of Colombo, Sri Lanka. She has participated in multiple EU projects, Swiss National Science Foundation projects and South Asian localization projects involving machine translation, speech recognition and computational linguistics in general. She has worked as a research intern in machine translation at the Idiap Research Institute, Switzerland, as well as a research assistant in computational linguistics at the Language Technology Research Laboratory, University of Colombo. Currently, she is also a member of the Bibliomics and Text Mining Group at the University of Applied Sciences, Geneva.
Pierrette Bouillon has been Professor at the Faculty of Translation and Interpreting (FTI), University of Geneva since 2007. She is currently Director of the Department of Translation Technology (referred to by its French acronym TIM) and Dean of the FTI. She has numerous publications in computational linguistics and natural language processing, particularly within speech-to-speech machine translation, accessibility and pre-editing/post-editing.
Jonathan David Mutal is a Research and Teaching Assistant at the Department of Translation Technology (referred to by its French acronym TIM). His research interests concentrate on neural machine translation, machine learning, natural language processing and evaluation. He is a strong advocate of producing research to bridge the gap between academia and business. Jonathan holds a BSc (a five-year degree) in Computer Science, and his master's thesis consisted of an ongoing academia-industry collaboration that aims to integrate MT into the workflow of a large language service provider. The thesis describes the evaluations carried out to select an MT tool (commercial or open-source) and to assess the suitability of machine translation for post-editing in the LSP's various subject areas and language pairs.
Translation difficulty in literary machine translation
This research aims to investigate the difficulty of literary machine translation (MT) for the English-Ukrainian language pair. Specifically, it will look into source text (ST) characteristics that make MT of literary texts difficult by focusing on five English novels and their translations into Ukrainian. In order to fulfil the research objective, we will measure the ST complexity of the novels, generate their MTs with five MT systems, and analyse the outputs by conducting an automatic evaluation. This will help us measure the difficulty of MT for literary texts and compare texts of varying levels of complexity. We anticipate that the results of this study can help assess the feasibility of using MT in the real-world workflow of literary translation for the English-Ukrainian language combination.
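The abstract does not specify which ST complexity measures will be used; two common ones (average sentence length and type-token ratio) can be sketched as follows, purely for illustration.

```python
# Two simple source-text complexity measures: average sentence length
# (tokens per sentence) and type-token ratio (lexical diversity).
import re

def complexity(text: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    return {
        "avg_sentence_len": len(tokens) / len(sentences) if sentences else 0.0,
        "type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
    }

sample = "The cat sat. The cat slept."
print(complexity(sample))  # short sentences, repeated vocabulary
```

A novel scoring high on both measures would be expected to be harder for MT; comparing such scores against automatic-evaluation results is one way to relate ST complexity to MT difficulty.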
Anastasiia Vestel is a current student at the European Master’s in Technology for Translation and Interpreting (EM-TTI) programme and a member of the European Association for Machine Translation (EAMT). She has completed the first year of her Master’s at Ghent University, Belgium, and is currently doing her second year at the University of Málaga, Spain. Her research interests include translation technology and natural language processing, specifically machine translation (MT) for literary texts, particularly for low-resource languages, and comparison of translation difficulty between various types of texts and different modalities (human translation versus MT/post-editing).
Lieve Macken is Associate Professor of Translation Technology at the Department of Translation, Interpreting and Communication of Ghent University (Belgium), where she also teaches Machine Translation. She has strong expertise in multilingual natural language processing. Her main research focus is translation technology and more specifically the comparison of different methods of translation (human vs. post-editing, human vs. computer-aided translation), translation quality assessment, and quality estimation for machine translation.
She was guest editor of the Special Issue "Advances in Computer-Aided Translation Technology" of the peer-reviewed journal Informatics (2019). She collaborated with the Directorate-General for Translation (DGT) of the European Commission to examine the impact of MT on the translation workflow at DGT. At Ghent University, she coordinates the Computer-Assisted Language Mediation postgraduate programme and the European Master's in Technology for Translation and Interpreting (EM-TTI).
Translation methods applied while dealing with system-bound terms (Polish-English translation) – a case study
The research aims to discuss incongruent Polish and British terms relating to company law. The Polish terms under analysis appear in the Polish Code of Commercial Partnerships and Companies and constitute legal terms according to the definition by Moroz. The English equivalents of each Polish term under research appear in two translations of the Polish Code of Commercial Partnerships and Companies into English. The theoretical part of the paper presents definitions of a system-bound term. The research problem is to verify whether the published typology of translation methods used in the Polish-English translation of succession and family law terms (which are civil law terms) covers the translation methods applied while translating company law terms into English. The stages of the research include: 1) presenting the definition of a Polish term; 2) enumerating the English equivalents of a given Polish term published so far and comparing their definitions (as long as they appear in English law dictionaries) with the definition of the Polish term under analysis; 3) checking whether an English equivalent appears in, among others, the sources of British law; 4) identifying the translation method that was applied while forming a given English equivalent.
Anna Kizinska, assistant professor at the University of Warsaw, Faculty of Applied Linguistics, holds a PhD in linguistics (translation studies) and an MA in law. She is a translator of legal texts and the author of monographs and research papers on legal translation.
Haifa Ben Naji
Ahmed Elhuseiny Bedeir
Updating translator education programs: Adapting to technologies and their impacts in the Canadian language industry
Perceptions of language technologies and how they are changing the language industry are having undeniable and striking effects on translator education programs. Institutions must not only update and adapt curricula to meet the needs of today and tomorrow, but also take into account the perceptions of potential students (and others) in order to attract new translators to training programs and ultimately to the industry. After beginning with a review of the literature and profiles and experiences of some other translator education programs, we will use programs at the University of Ottawa, Canada, as a case study. Gathering data through documents, interviews and questionnaires, we will examine how the general public, potential students, current students, alumni and employers may perceive the effects of technologies on the language professions, and how programs may be called to adapt by introducing new technologies, changing the focus of teaching, and introducing new skills to ensure that graduates are prepared to enter and thrive in the language industry.
Elizabeth Marshman is an Associate Professor at the University of Ottawa School of Translation and Interpretation, and a member of the Observatoire de linguistique Sens-Texte. Her research interests include user perspectives on translation technologies, technology teaching, and computer-assisted terminology.
Anwar Alfetlawi is a PhD student at the University of Ottawa, School of Translation and Interpretation. He is also an experienced freelance translator and an ESL instructor. His main research interests include simultaneous interpretation, translation technology, and the integration of educational technology in ESL classes.
Haifa Ben Naji is a PhD student at the University of Ottawa, School of Translation and Interpretation. Her research interests include terminology management in commercial environments, localization, and translation technology teaching.
Dipen Dave is an MSc Student specializing in marketing and behavioural science at the University of Ottawa, Telfer School of Management. His main research interests include brand management, consumer behavioural studies, and word of mouth marketing.
Ahmed Elhuseiny Bedeir is a PhD candidate at the University of Ottawa, School of Translation and Interpretation, and an English-Arabic translator certified by the American Translators Association and the Association of Translators and Interpreters of Ontario. His research interests include translation technology teaching and the use of translation in foreign language classes.
Ting Liu is a PhD student at the University of Ottawa School of Translation and Interpretation. Her research interests include translation pedagogy and translation technologies.
The use of speech technologies and machine translation in institutional translation practices
Using embeddings to optimise translation memory usage
Developments in the area of GPT-like Large Language Models (LLMs) have brought a wide range of benefits to researchers, digital content creators and linguists in general. One of the main benefits is the significant advance in text embeddings used for similarity search. Embeddings are numeric representations of text that are capable of grasping the semantic layer – the actual meaning of the text. For example, the following two sentences are considered similar when compared using embeddings:
“They will send the first version of the document for acceptance”
“It shall submit a draft of the protocols for conclusion to the Council.”
Similarities of that kind are impossible to identify using the classic fuzzy search techniques due to low lexical overlap.
Using text embeddings for similarity analysis is a way to maximise the gains of using a classic translation memory. This allows for improved fuzzy matching and a new, more effective quality-assurance technique. It also enables offline translation memory analysis.
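The contrast between classic fuzzy matching and embedding-based matching can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the lexical score is a simple bag-of-words overlap standing in for classic fuzzy matching, and the embedding vectors are hand-made placeholders standing in for the output of a real embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def lexical_overlap(s1, s2):
    """Stand-in for a classic fuzzy score: shared tokens / union of tokens."""
    t1, t2 = set(s1.lower().split()), set(s2.lower().split())
    return len(t1 & t2) / len(t1 | t2)

src = "They will send the first version of the document for acceptance"
tm = "It shall submit a draft of the protocols for conclusion to the Council."

# Low lexical overlap: a classic fuzzy matcher would discard this TM entry.
print(f"lexical overlap:      {lexical_overlap(src, tm):.2f}")

# Placeholder embeddings for the same pair; with a real model, a high
# cosine similarity would reflect the shared meaning despite the wording.
emb_src = [0.31, 0.74, 0.12, 0.55]
emb_tm = [0.10, 0.65, 0.40, 0.48]
print(f"embedding similarity: {cosine(emb_src, emb_tm):.2f}")
```

In a real pipeline, `emb_src` and `emb_tm` would be produced by an embedding model, and the cosine score would be used alongside (or instead of) the fuzzy score when ranking translation memory candidates.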
Rafał Jaworski, PhD, works as a Linguistic AI Expert at XTM International. He is an academic lecturer and scientist specialising in natural language processing techniques. He develops robust AI algorithms for the needs of computer-assisted translation, including, among others, automatic lookup of linguistic resources and automatic post-editing. At XTM International he leads a team of young and talented AI specialists who put his visions and ideas into practice.
Using Large Language Models (LLMs) to improve Quality Assessment
Quality assessment can be one of the most manual, time-consuming, and contentious steps in the translation workflow. Can Large Language Models (LLMs) and AI be used to streamline this process?
There are many considerations, not least data privacy, accuracy, cost, and unbiased training data for the LLM. This research covers building an accurate data set for training, assessing whether a self-built, privately hosted LLM is fit for this purpose, the costs associated with building such an LLM, and using this LLM to automate components of the review process.
Sponsors' Leadership Talk Presenters
Juan Castro (?)
Title of Leadership talk to follow here
This speaker will also be moderating the two sponsored workshops: “LogiTerm Web and the Plug-in for Research” and “Terminology and Alignment Management”
[Abstract for Leadership talk to follow]
With a background in automated productivity and in electronics, I began my career in 1985 as a Computer-Aided Design (CAD) technician before devoting myself to computer-aided translation (CAT) tools in 1994. Since then, through various work experiences, I have acquired expertise in translation memory systems, terminology extraction tools, terminology management, machine translation, full text, bitexts, and project management systems.
In 1994, I founded BridgeTerm (now a translation management system), which acted as a CAT broker. Over the years, I developed and improved SynchroTerm, a bilingual terminology extraction tool for feeding terminology databases from translated documents.
In September 2006, Terminotix acquired SynchroTerm and I joined the team as Sales Director.
In April 2010, I acquired the company through a management buyout, and I have been president of Terminotix ever since. Over the last two years, I have conducted research on automatically generating high-quality alignments in any language pair in order to feed neural machine translation engines and bilingual concordancers.
Early in 2020, during the pandemic, I developed WeBiSearch, a bilingual website search engine available free of charge for linguistic services. In spring 2021, I released the Terminotix plugin for Windows, which allows users to translate from inside any application.