AIST AIRC Submissions to the WMT23 Shared Task

Matiss Rikters, Makoto Miwa


Abstract
This paper describes the development of the NMT systems submitted to the WMT 2023 General Translation task by the AIST AIRC team. We trained constrained-track models for translation between English, German, and Japanese. Before training the final models, we filtered the parallel and monolingual data, then performed iterative back-translation and distilled the parallel data for use in non-autoregressive model training. We experimented with Transformer models, Mega models, and custom non-autoregressive sequence-to-sequence models whose encoder and decoder weights were initialised from a multilingual BERT base model. Our primary submissions contain translations from ensembles of two Mega model checkpoints, and our contrastive submissions were generated by our non-autoregressive models.
Anthology ID:
2023.wmt-1.13
Volume:
Proceedings of the Eighth Conference on Machine Translation
Month:
December
Year:
2023
Address:
Singapore
Editors:
Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
155–161
URL:
https://aclanthology.org/2023.wmt-1.13
DOI:
10.18653/v1/2023.wmt-1.13
Cite (ACL):
Matiss Rikters and Makoto Miwa. 2023. AIST AIRC Submissions to the WMT23 Shared Task. In Proceedings of the Eighth Conference on Machine Translation, pages 155–161, Singapore. Association for Computational Linguistics.
Cite (Informal):
AIST AIRC Submissions to the WMT23 Shared Task (Rikters & Miwa, WMT 2023)
PDF:
https://aclanthology.org/2023.wmt-1.13.pdf