Language Modeling, Lexical Translation, Reordering: The Training Process of NMT through the Lens of Classical SMT

Elena Voita, Rico Sennrich, Ivan Titov


Abstract
Unlike traditional statistical MT, which decomposes the translation task into distinct, separately learned components, neural machine translation uses a single neural network to model the entire translation process. Despite neural machine translation being the de facto standard, it is still not clear how NMT models acquire different competences over the course of training, and how this mirrors the different models in traditional SMT. In this work, we look at the competences related to three core SMT components and find that during training, NMT first focuses on learning target-side language modeling, then improves translation quality, approaching word-by-word translation, and finally learns more complicated reordering patterns. We show that this behavior holds for several models and language pairs. Additionally, we explain how such an understanding of the training process can be useful in practice and, as an example, show how it can be used to improve vanilla non-autoregressive neural machine translation by guiding teacher model selection.
Anthology ID: 2021.emnlp-main.667
Volume: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2021
Address: Online and Punta Cana, Dominican Republic
Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 8478–8491
URL: https://aclanthology.org/2021.emnlp-main.667
DOI: 10.18653/v1/2021.emnlp-main.667
Cite (ACL): Elena Voita, Rico Sennrich, and Ivan Titov. 2021. Language Modeling, Lexical Translation, Reordering: The Training Process of NMT through the Lens of Classical SMT. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 8478–8491, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal): Language Modeling, Lexical Translation, Reordering: The Training Process of NMT through the Lens of Classical SMT (Voita et al., EMNLP 2021)
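
For LaTeX users, a BibTeX entry can be assembled from the metadata above. The entry key below follows the Anthology's usual first-author/year/first-title-word convention and is an assumption, not copied from the site:

    % entry key is assumed from the Anthology's naming convention
    @inproceedings{voita-etal-2021-language,
        title     = "Language Modeling, Lexical Translation, Reordering: The Training Process of {NMT} through the Lens of Classical {SMT}",
        author    = "Voita, Elena and Sennrich, Rico and Titov, Ivan",
        booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
        month     = nov,
        year      = "2021",
        address   = "Online and Punta Cana, Dominican Republic",
        publisher = "Association for Computational Linguistics",
        url       = "https://aclanthology.org/2021.emnlp-main.667",
        doi       = "10.18653/v1/2021.emnlp-main.667",
        pages     = "8478--8491",
    }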
PDF: https://aclanthology.org/2021.emnlp-main.667.pdf
Video: https://aclanthology.org/2021.emnlp-main.667.mp4