Multilingual Unsupervised Neural Machine Translation with Denoising Adapters

Ahmet Üstün, Alexandre Berard, Laurent Besacier, Matthias Gallé


Abstract
We consider the problem of multilingual unsupervised machine translation: translating to and from languages that only have monolingual data by using auxiliary parallel language pairs. For this problem, the standard procedure so far to leverage the monolingual data is _back-translation_, which is computationally costly and hard to tune. In this paper, we propose instead to use _denoising adapters_, adapter layers with a denoising objective, on top of pre-trained mBART-50. In addition to the modularity and flexibility of such an approach, we show that the resulting translations are on par with back-translation as measured by BLEU, and furthermore that it allows adding unseen languages incrementally.
Anthology ID:
2021.emnlp-main.533
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
6650–6662
URL:
https://aclanthology.org/2021.emnlp-main.533
DOI:
10.18653/v1/2021.emnlp-main.533
Cite (ACL):
Ahmet Üstün, Alexandre Berard, Laurent Besacier, and Matthias Gallé. 2021. Multilingual Unsupervised Neural Machine Translation with Denoising Adapters. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 6650–6662, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Multilingual Unsupervised Neural Machine Translation with Denoising Adapters (Üstün et al., EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.533.pdf
Video:
https://aclanthology.org/2021.emnlp-main.533.mp4
Data:
FLoRes