Susie Coleman
2020
Architecture of a Scalable, Secure and Resilient Translation Platform for Multilingual News Media
Susie Coleman | Andrew Secker | Rachel Bawden | Barry Haddow | Alexandra Birch
Proceedings of the 1st International Workshop on Language Technology Platforms
This paper presents an example architecture for a scalable, secure and resilient Machine Translation (MT) platform, using components available via Amazon Web Services (AWS). It is increasingly common for a single news organisation to publish and monitor news sources in multiple languages. A growth in news sources makes this increasingly challenging and time-consuming, but MT can help automate some aspects of this process. Building a translation service provides a single integration point for newsroom tools that use translation technology, allowing MT models to be integrated into a system once, rather than each time the translation technology is needed. By using a range of services provided by AWS, it is possible to architect a platform where multiple pre-existing technologies are combined to build a solution, as opposed to developing software from scratch for deployment on a single virtual machine. This increases the speed at which a platform can be developed and allows the use of well-maintained services. However, a single service also presents challenges. It is key to consider how the platform will scale when handling many users and how to ensure the platform is resilient.
An English-Swahili parallel corpus and its use for neural machine translation in the news domain
Felipe Sánchez-Martínez | Víctor M. Sánchez-Cartagena | Juan Antonio Pérez-Ortiz | Mikel L. Forcada | Miquel Esplà-Gomis | Andrew Secker | Susie Coleman | Julie Wall
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation
This paper describes our approach to creating a neural machine translation system that translates between English and Swahili (in both directions) in the news domain, as well as the process we followed to crawl the necessary parallel corpora from the Internet. We report the results of a pilot human evaluation performed by the news media organisations participating in the H2020 EU-funded project GoURMET.