2022
pdf
bib
abs
Achievements of the PRINCIPLE Project: Promoting MT for Croatian, Icelandic, Irish and Norwegian
Petra Bago
|
Sheila Castilho
|
Jane Dunne
|
Federico Gaspari
|
Andre Kåsen
|
Gauti Kristmannsson
|
Jon Arild Olsen
|
Natalia Resende
|
Níels Rúnar Gíslason
|
Dana D. Sheridan
|
Páraic Sheridan
|
John Tinsley
|
Andy Way
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
This paper provides an overview of the main achievements of the completed PRINCIPLE project, a 2-year action funded by the European Commission under the Connecting Europe Facility (CEF) programme. PRINCIPLE focused on collecting high-quality language resources for Croatian, Icelandic, Irish and Norwegian, which are severely low-resource languages, especially for building effective machine translation (MT) systems. We report the achievements of the project, primarily in terms of the large amounts of data collected for all four low-resource languages and of promoting the uptake of neural MT (NMT) for these languages.
2020
pdf
bib
abs
Progress of the PRINCIPLE Project: Promoting MT for Croatian, Icelandic, Irish and Norwegian
Andy Way
|
Petra Bago
|
Jane Dunne
|
Federico Gaspari
|
Andre Kåsen
|
Gauti Kristmannsson
|
Helen McHugh
|
Jon Arild Olsen
|
Dana Davis Sheridan
|
Páraic Sheridan
|
John Tinsley
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation
This paper updates the progress made on the PRINCIPLE project, a 2-year action funded by the European Commission under the Connecting Europe Facility (CEF) programme. PRINCIPLE focuses on collecting high-quality language resources for Croatian, Icelandic, Irish and Norwegian, which have been identified as low-resource languages, especially for building effective machine translation (MT) systems. We report initial achievements of the project and ongoing activities aimed at promoting the uptake of neural MT for the low-resource languages of the project.
2019
pdf
bib
Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks
Mikel Forcada
|
Andy Way
|
John Tinsley
|
Dimitar Shterionov
|
Celia Rico
|
Federico Gaspari
Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks
pdf
bib
Improving Robustness in Real-World Neural Machine Translation Engines
Rohit Gupta
|
Patrik Lambert
|
Raj Patel
|
John Tinsley
Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks
2016
bib
What? Why? How? - Factors that impact the success of commercial MT projects
John Tinsley
Conferences of the Association for Machine Translation in the Americas: MT Users' Track
2015
pdf
bib
Improving translator productivity with MT: a patent translation case study
John Tinsley
Proceedings of the 6th Workshop on Patent and Scientific Literature Translation
2014
bib
From the lab to the market: commercialising MT research
John Tinsley
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Users Track
2013
pdf
bib
Exploiting Parallel Corpus for Handling Out-of-Vocabulary Words
Juan Luo
|
John Tinsley
|
Yves Lepage
Proceedings of the 27th Pacific Asia Conference on Language, Information, and Computation (PACLIC 27)
2012
pdf
bib
PLUTO: Automated Solutions for Patent Translation
John Tinsley
|
Alexandru Ceausu
|
Jian Zhang
Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra)
pdf
bib
abs
IPTranslator: Facilitating Patent Search with Machine Translation
John Tinsley
|
Alexandru Ceausu
|
Jian Zhang
|
Heidi Depraetere
|
Joeri Van de Walle
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Commercial MT User Program
Intellectual Property professionals frequently need to carry out patent searches for a variety of reasons. During a typical search, they will retrieve approximately 30% of their results in a foreign language. The machine translation (MT) options currently available to patent searchers for these foreign-language patents vary in their quality, consistency, and general level of service. In this article, we introduce IPTranslator, an MT web service designed to cater for the needs of patent searchers. At the core of IPTranslator is a set of MT systems developed specifically for translating patent text. We describe the challenges faced in adapting MT technology to such a complex domain, and how the systems were evaluated to ensure that the quality was fit for purpose. Finally, we present the framework through which the IPTranslator service is delivered to users, and the value-adding features which address many of the issues with existing solutions.
2011
pdf
bib
Experiments on Domain Adaptation for Patent Machine Translation in the PLuTO project
Alexandru Ceauşu
|
John Tinsley
|
Jian Zhang
|
Andy Way
Proceedings of the 15th Annual Conference of the European Association for Machine Translation
2010
pdf
bib
abs
PLuTO: MT for On-Line Patent Translation
John Tinsley
|
Andy Way
|
Páraic Sheridan
Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Commercial MT User Program
PLuTO – Patent Language Translation Online – is a partially EU-funded commercialization project which specializes in the automatic retrieval and translation of patent documents. At the core of the PLuTO framework is a machine translation (MT) engine through which web-based translation services are offered. The fully integrated PLuTO architecture includes a translation engine coupling MT with translation memories (TM), and a patent search and retrieval engine. In this paper, we first describe the motivating factors behind the provision of such a service. Following this, we give an overview of the PLuTO framework as a whole, with particular emphasis on the MT components, and provide a real world use case scenario in which PLuTO MT services are exploited.
2008
pdf
bib
abs
Comparing Constituency and Dependency Representations for SMT Phrase-Extraction
Mary Hearne
|
Sylwia Ozdowska
|
John Tinsley
Actes de la 15ème conférence sur le Traitement Automatique des Langues Naturelles. Articles courts
We consider the value of replacing and/or combining string-based methods with syntax-based methods for phrase-based statistical machine translation (PBSMT), and we also consider the relative merits of using constituency-annotated vs. dependency-annotated training data. We automatically derive two subtree-aligned treebanks, dependency-based and constituency-based, from a parallel English–French corpus and extract syntactically motivated word- and phrase-pairs. We automatically measure PB-SMT quality. The results show that combining string-based and syntax-based word- and phrase-pairs can improve translation quality irrespective of the type of syntactic annotation. Furthermore, using dependency annotation yields greater translation quality than constituency annotation for PB-SMT.
pdf
bib
abs
Exploiting alignment techniques in MATREX: the DCU machine translation system for IWSLT 2008.
Yanjun Ma
|
John Tinsley
|
Hany Hassan
|
Jinhua Du
|
Andy Way
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign
In this paper, we give a description of the machine translation (MT) system developed at DCU that was used for our third participation in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT 2008). In this participation, we focus on various techniques for word and phrase alignment to improve system quality. Specifically, we try out our word packing and syntax-enhanced word alignment techniques for the Chinese–English task and for the English–Chinese task for the first time. For all translation tasks except Arabic–English, we exploit linguistically motivated bilingual phrase pairs extracted from parallel treebanks. We smooth our translation tables with out-of-domain word translations for the Arabic–English and Chinese–English tasks in order to solve the problem of the high number of out-of-vocabulary items. We also carried out experiments combining both in-domain and out-of-domain data to improve system performance and, finally, we deploy a majority voting procedure combining a language model-based method and a translation-based method for case and punctuation restoration. We participated in all the translation tasks and translated both the single-best ASR hypotheses and the correct recognition results. The translation results confirm that our new word and phrase alignment techniques are often helpful in improving translation quality, and the data combination method we proposed can significantly improve system performance.
pdf
bib
MaTrEx: The DCU MT System for WMT 2008
John Tinsley
|
Yanjun Ma
|
Sylwia Ozdowska
|
Andy Way
Proceedings of the Third Workshop on Statistical Machine Translation
2007
pdf
bib
Robust language pair-independent sub-tree alignment
John Tinsley
|
Ventsislav Zhechev
|
Mary Hearne
|
Andy Way
Proceedings of Machine Translation Summit XI: Papers
pdf
bib
Capturing translational divergences with a statistical tree-to-tree aligner
Mary Hearne
|
John Tinsley
|
Ventsislav Zhechev
|
Andy Way
Proceedings of the 11th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages: Papers