Federico Gaspari


2024

Proceedings of the Second International Workshop Towards Digital Language Equality (TDLE): Focusing on Sustainability @ LREC-COLING 2024
Federico Gaspari | Joss Moorkens | Itziar Aldabe | Aritz Farwell | Begona Altuna | Stelios Piperidis | Georg Rehm | German Rigau
Proceedings of the Second International Workshop Towards Digital Language Equality (TDLE): Focusing on Sustainability @ LREC-COLING 2024

Surveying the Technology Support of Languages
Annika Grützner-Zahn | Federico Gaspari | Maria Giagkou | Stefanie Hegele | Andy Way | Georg Rehm
Proceedings of the Second International Workshop Towards Digital Language Equality (TDLE): Focusing on Sustainability @ LREC-COLING 2024

Many of the world’s languages are left behind when it comes to Language Technology applications, since most of these are available only in a limited number of languages, creating a digital divide that affects millions of users worldwide. It is crucial, therefore, to monitor and quantify the progress of technology support for individual languages, which also enables comparisons across language communities. In this way, efforts can be directed towards reducing language barriers, promoting economic and social inclusion, and ensuring that all citizens can use their preferred language in the digital age. This paper critically reviews and compares recent quantitative approaches to measuring technology support for languages. Despite using different approaches and methodologies, the findings of all analysed papers demonstrate the unequal distribution of technology support and emphasise the existence of a digital divide among languages.

2022

Introducing the Digital Language Equality Metric: Technological Factors
Federico Gaspari | Owen Gallagher | Georg Rehm | Maria Giagkou | Stelios Piperidis | Jane Dunne | Andy Way
Proceedings of the Workshop Towards Digital Language Equality within the 13th Language Resources and Evaluation Conference

This paper introduces the concept of Digital Language Equality (DLE) developed by the EU-funded European Language Equality (ELE) project, and describes the associated DLE Metric with a focus on its technological factors (TFs), which are complemented by situational contextual factors. This work aims at objectively describing the level of technological support of all European languages and lays the foundation to implement a large-scale EU-wide programme to ensure that these languages can continue to exist and prosper in the digital age, to serve the present and future needs of their speakers. The paper situates this ongoing work with a strong European focus in the broader context of related efforts, and explains how the DLE Metric can help track the progress towards DLE for all languages of Europe, focusing in particular on the role played by the TFs. These are derived from the European Language Grid (ELG) Catalogue, which provides the empirical basis to measure the level of digital readiness of all European languages. The DLE Metric scores can be consulted through an online interactive dashboard to show the level of technological support of each European language and track the overall progress toward DLE.

Achievements of the PRINCIPLE Project: Promoting MT for Croatian, Icelandic, Irish and Norwegian
Petra Bago | Sheila Castilho | Jane Dunne | Federico Gaspari | Andre K | Gauti Kristmannsson | Jon Arild Olsen | Natalia Resende | Níels Rúnar Gíslason | Dana D. Sheridan | Páraic Sheridan | John Tinsley | Andy Way
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

This paper provides an overview of the main achievements of the completed PRINCIPLE project, a 2-year action funded by the European Commission under the Connecting Europe Facility (CEF) programme. PRINCIPLE focused on collecting high-quality language resources for Croatian, Icelandic, Irish and Norwegian, which are severely low-resource languages, especially for building effective machine translation (MT) systems. We report the achievements of the project, primarily, in terms of the large amounts of data collected for all four low-resource languages and of promoting the uptake of neural MT (NMT) for these languages.

Overview of the ELE Project
Itziar Aldabe | Jane Dunne | Aritz Farwell | Owen Gallagher | Federico Gaspari | Maria Giagkou | Jan Hajic | Jens Peter Kückens | Teresa Lynn | Georg Rehm | German Rigau | Katrin Marheinecke | Stelios Piperidis | Natalia Resende | Tea Vojtěchová | Andy Way
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

This paper provides an overview of the ongoing European Language Equality (ELE) project, an 18-month action funded by the European Commission which involves 52 partners. The primary goal of ELE is to prepare the European Language Equality Programme, in the form of a strategic research, innovation and implementation agenda and a roadmap for achieving full digital language equality (DLE) in Europe by 2030.

2021

Building MT systems in low resourced languages for Public Sector users in Croatia, Iceland, Ireland, and Norway
Róisín Moran | Carla Para Escartín | Akshai Ramesh | Páraic Sheridan | Jane Dunne | Federico Gaspari | Sheila Castilho | Natalia Resende | Andy Way
Proceedings of Machine Translation Summit XVIII: Users and Providers Track

When developing Machine Translation engines, low resourced language pairs tend to be in a disadvantaged position: less available data means that developing robust MT models can be more challenging. The EU-funded PRINCIPLE project aims at overcoming this challenge for four low resourced European languages: Norwegian, Croatian, Irish and Icelandic. This presentation will give an overview of the project, with a focus on the set of Public Sector users and their use cases for which we have developed MT solutions. We will discuss the range of language resources that have been gathered through contributions from public sector collaborators, and present the extensive evaluations that have been undertaken, including significant user evaluation of MT systems across all of the public sector participants in each of the four countries involved.

2020

ELRI: A Decentralised Network of National Relay Stations to Collect, Prepare and Share Language Resources
Thierry Etchegoyhen | Borja Anza Porras | Andoni Azpeitia | Eva Martínez Garcia | José Luis Fonseca | Patricia Fonseca | Paulo Vale | Jane Dunne | Federico Gaspari | Teresa Lynn | Helen McHugh | Andy Way | Victoria Arranz | Khalid Choukri | Hervé Pusset | Alexandre Sicard | Rui Neto | Maite Melero | David Perez | António Branco | Ruben Branco | Luís Gomes
Proceedings of the 1st International Workshop on Language Technology Platforms

We describe the European Language Resource Infrastructure (ELRI), a decentralised network to help collect, prepare and share language resources. The infrastructure was developed within a project co-funded by the Connecting Europe Facility Programme of the European Union, and has been deployed in the four Member States participating in the project, namely France, Ireland, Portugal and Spain. ELRI provides sustainable and flexible means to collect and share language resources via National Relay Stations, to which members of public institutions can freely subscribe. The infrastructure includes fully automated data processing engines to facilitate the preparation, sharing and wider reuse of useful language resources that can help optimise human and automated translation services in the European Union.

Progress of the PRINCIPLE Project: Promoting MT for Croatian, Icelandic, Irish and Norwegian
Andy Way | Petra Bago | Jane Dunne | Federico Gaspari | Andre Kåsen | Gauti Kristmannsson | Helen McHugh | Jon Arild Olsen | Dana Davis Sheridan | Páraic Sheridan | John Tinsley
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation

This paper updates the progress made on the PRINCIPLE project, a 2-year action funded by the European Commission under the Connecting Europe Facility (CEF) programme. PRINCIPLE focuses on collecting high-quality language resources for Croatian, Icelandic, Irish and Norwegian, which have been identified as low-resource languages, especially for building effective machine translation (MT) systems. We report initial achievements of the project and ongoing activities aimed at promoting the uptake of neural MT for the low-resource languages of the project.

2019

Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks
Mikel Forcada | Andy Way | John Tinsley | Dimitar Shterionov | Celia Rico | Federico Gaspari
Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks

PRINCIPLE: Providing Resources in Irish, Norwegian, Croatian and Icelandic for the Purposes of Language Engineering
Andy Way | Federico Gaspari
Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks

Large-scale Machine Translation Evaluation of the iADAATPA Project
Sheila Castilho | Natália Resende | Federico Gaspari | Andy Way | Tony O’Dowd | Marek Mazur | Manuel Herranz | Alex Helle | Gema Ramírez-Sánchez | Víctor Sánchez-Cartagena | Mārcis Pinnis | Valters Šics
Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks

2018

ELRI - European Language Resources Infrastructure
Thierry Etchegoyhen | Borja Anza Porras | Andoni Azpeitia | Eva Martínez Garcia | Paulo Vale | José Luis Fonseca | Teresa Lynn | Jane Dunne | Federico Gaspari | Andy Way | Victoria Arranz | Khalid Choukri | Vladimir Popescu | Pedro Neiva | Rui Neto | Maite Melero | David Perez Fernandez | Antonio Branco | Ruben Branco | Luis Gomes
Proceedings of the 21st Annual Conference of the European Association for Machine Translation

We describe the European Language Resources Infrastructure project, whose main aim is the provision of an infrastructure to help collect, prepare and share language resources that can in turn improve translation services in Europe.

Improving Machine Translation of Educational Content via Crowdsourcing
Maximiliana Behnke | Antonio Valerio Miceli Barone | Rico Sennrich | Vilelmini Sosoni | Thanasis Naskos | Eirini Takoulidou | Maria Stasimioti | Menno van Zaanen | Sheila Castilho | Federico Gaspari | Panayota Georgakopoulou | Valia Kordoni | Markus Egg | Katia Lida Kermanidis
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

Enhancing Machine Translation of Academic Course Catalogues with Terminological Resources
Randy Scansani | Silvia Bernardini | Adriano Ferraresi | Federico Gaspari | Marcello Soffritti
Proceedings of the Workshop Human-Informed Translation and Interpreting Technology

This paper describes an approach to translating course unit descriptions from Italian and German into English, using a phrase-based machine translation (MT) system. The genre is very prominent among those requiring translation by universities in European countries in which English is a non-native language. For each language combination, an in-domain bilingual corpus including course unit and degree program descriptions is used to train an MT engine, whose output is then compared to a baseline engine trained on the Europarl corpus. In a subsequent experiment, a bilingual terminology database is added to the training sets in both engines and its impact on the output quality is evaluated based on BLEU and post-editing score. Results suggest that the use of domain-specific corpora boosts the engines’ quality for both language combinations, especially for German-English, whereas adding terminological resources does not seem to bring notable benefits.

A Comparative Quality Evaluation of PBSMT and NMT using Professional Translators
Sheila Castilho | Joss Moorkens | Federico Gaspari | Rico Sennrich | Vilelmini Sosoni | Panayota Georgakopoulou | Pintu Lohar | Andy Way | Antonio Valerio Miceli-Barone | Maria Gialama
Proceedings of Machine Translation Summit XVI: Research Track

2016

TraMOOC (Translation for Massive Open Online Courses): providing reliable MT for MOOCs
Valia Kordoni | Lexi Birch | Ioana Buliga | Kostadin Cholakov | Markus Egg | Federico Gaspari | Yota Georgakopolou | Maria Gialama | Iris Hendrickx | Mitja Jermol | Katia Kermanidis | Joss Moorkens | Davor Orlic | Michael Papadopoulos | Maja Popović | Rico Sennrich | Vilelmini Sosoni | Dimitrios Tsoumakos | Antal van den Bosch | Menno van Zaanen | Andy Way
Proceedings of the 19th Annual Conference of the European Association for Machine Translation: Projects/Products

Enhancing Cross-border EU E-commerce through Machine Translation: Needed Language Resources, Challenges and Opportunities
Meritxell Fernández Barrera | Vladimir Popescu | Antonio Toral | Federico Gaspari | Khalid Choukri
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper discusses the role that statistical machine translation (SMT) can play in the development of cross-border EU e-commerce, by highlighting extant obstacles and identifying relevant technologies to overcome them. In this sense, it firstly proposes a typology of e-commerce static and dynamic textual genres and it identifies those that may be more successfully targeted by SMT. The specific challenges concerning the automatic translation of user-generated content are discussed in detail. Secondly, the paper highlights the risk of data sparsity inherent to e-commerce and it explores the state-of-the-art strategies to achieve domain adequacy via adaptation. Thirdly, it proposes a robust workflow for the development of SMT systems adapted to the e-commerce domain by relying on inexpensive methods. Given the scarcity of user-generated language corpora for most language pairs, the paper proposes to obtain monolingual target-language data to train language models and aligned parallel corpora to tune and evaluate MT systems by means of crowdsourcing.

2014

Perception vs. reality: measuring machine translation post-editing productivity
Federico Gaspari | Antonio Toral | Sudip Kumar Naskar | Declan Groves | Andy Way
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas

This paper presents a study of user-perceived vs real machine translation (MT) post-editing effort and productivity gains, focusing on two bidirectional language pairs: English—German and English—Dutch. Twenty experienced media professionals post-edited statistical MT output and also manually translated comparative texts within a production environment. The paper compares the actual post-editing time against the users’ perception of the effort and time required to post-edit the MT output to achieve publishable quality, thus measuring real (vs perceived) productivity gains. Although for all the language pairs users perceived MT post-editing to be slower, in fact it proved to be a faster option than manual translation for two translation directions out of four, i.e. for Dutch to English, and (marginally) for English to German. For further objective scrutiny, the paper also checks the correlation of three state-of-the-art automatic MT evaluation metrics (BLEU, METEOR and TER) with the actual post-editing time.

2013

A Web Application for the Diagnostic Evaluation of Machine Translation over Specific Linguistic Phenomena
Antonio Toral | Sudip Kumar Naskar | Joris Vreeke | Federico Gaspari | Declan Groves
Proceedings of the 2013 NAACL HLT Demonstration Session

Meta-Evaluation of a Diagnostic Quality Metric for Machine Translation
Sudip Kumar Naskar | Antonio Toral | Federico Gaspari | Declan Groves
Proceedings of Machine Translation Summit XIV: Papers

2011

A Comparative Evaluation of Research vs. Online MT Systems
Antonio Toral | Federico Gaspari | Sudip Kumar Naskar | Andy Way
Proceedings of the 15th Annual Conference of the European Association for Machine Translation

A Framework for Diagnostic Evaluation of MT Based on Linguistic Checkpoints
Sudip Kumar Naskar | Antonio Toral | Federico Gaspari | Andy Way
Proceedings of Machine Translation Summit XIII: Papers

2007

Online and free! Ten years of online machine translation: origins, developments, current use and future prospects
Federico Gaspari | John Hutchins
Proceedings of Machine Translation Summit XI: Papers

Using free online MT in multilingual websites
Federico Gaspari | Harold Somers
Proceedings of Machine Translation Summit XI: Tutorials

Making a sow’s ear out of a silk purse: (mis)using online MT services as bilingual dictionaries
Federico Gaspari | Harold Somers
Proceedings of Translating and the Computer 29

2006

Detecting Inappropriate Use of Free Online Machine Translation by Language Students. A Special Case of Plagiarism Detection
Harold Somers | Federico Gaspari | Ana Niño
Proceedings of the 11th Annual Conference of the European Association for Machine Translation

Look Who’s Translating. Impersonations, Chinese Whispers and Fun with Machine Translation on the Internet
Federico Gaspari
Proceedings of the 11th Annual Conference of the European Association for Machine Translation

The Added Value of Free Online MT Services
Federico Gaspari
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers

This paper reports on an experiment investigating how effective free online machine translation (MT) is in helping Internet users to access the contents of websites written only in languages they do not know. This study explores the extent to which using Internet-based MT tools affects the confidence of web-surfers in the reliability of the information they find on websites available only in languages unfamiliar to them. The results of a case study for the language pair Italian-English involving 101 participants show that the chances of identifying correctly basic information (i.e. understanding the nature of websites and finding contact telephone numbers from their web-pages) are consistently enhanced to varying degrees (up to nearly 20%) by translating online content into a familiar language. In addition, confidence ratings given by users to the reliability and accuracy of the information they find are significantly higher (with increases between 5 and 11%) when they translate websites into their preferred language with free online MT services.

The Added Value of Free Online MT Services
Federico Gaspari
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: User Track Presentations

The social impact of online MT
Federico Gaspari
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Panel on machine translation for social impact

2005

Embedding Free Online Machine Translation into Monolingual Websites for Multilingual Dissemination: a Case Study of Implementation
Federico Gaspari
Translating and the Computer 27

2004

Online MT services and real users’ needs: an empirical usability evaluation
Federico Gaspari
Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Technical Papers

This paper presents an empirical evaluation of the main usability factors that play a significant role in the interaction with on-line Machine Translation (MT) services. The investigation is carried out from the point of view of typical users with an emphasis on their real needs, and focuses on a set of key usability criteria that have an impact on the successful deployment of Internet-based MT technology. A small-scale evaluation of the performance of five popular web-based MT systems against the selected usability criteria shows that different approaches to interaction design can dramatically affect the level of user satisfaction. There are strong indications that the results of this study can be fed back into the development of on-line MT services to enhance their design, thus ensuring that they meet the requirements and expectations of a wide range of Internet users.

Integrating on-line MT services into monolingual web-sites for dissemination purposes: an evaluation perspective
Federico Gaspari
Proceedings of the 9th EAMT Workshop: Broadening horizons of machine translation and its applications

2002

Using free on-line services in MT training
Federico Gaspari
Proceedings of the 6th EAMT Workshop: Teaching Machine Translation

2001

Teaching machine translation to trainee translators: a survey of their knowledge and opinions
Federico Gaspari
Workshop on Teaching Machine Translation

This paper reports upon a survey carried out among thirty-eight trainee translators who took courses on machine translation. The survey was conducted asking the sample of students to fill out a questionnaire both at the beginning and at the end of the MT course. The questions aimed at assessing the degree of knowledge about MT of the respondents and the opinions and impressions that they accordingly had on it. The results of the questionnaire were elaborated so as to investigate the relationship between the increase in the knowledge about MT after the conclusion of the course, and the corresponding change in the students’ attitude towards the discipline, which became much less biased and in general fairly positive, thanks to a very successful and rewarding learning process. The paper suggests that the more the trainee translators became familiar with MT, realising its reasonable potential and current limitations, the less afraid they were of it. These findings encourage the increasing integration and introduction of technology into translation curricula, since the impact of computer technology on language translation directly affects professional human translators. As a result, exposing trainee translators to machine translation seems to raise the profile of their training.