On the Effectiveness of Quasi Character-Level Models for Machine Translation

Salvador Carrión, Francisco Casacuberta

Abstract
Neural Machine Translation (NMT) models often use subword-level vocabularies to deal with rare or unknown words. Although some studies have shown the effectiveness of purely character-based models, these approaches result in computationally expensive models. In this work, we explore the benefits of quasi-character-level models for very low-resource languages and their ability to mitigate the effects of the catastrophic forgetting problem. First, we conduct an empirical study of the efficacy of these models as a function of vocabulary and training set size, across a range of languages, domains, and architectures. Next, we study their ability to mitigate the effects of catastrophic forgetting in machine translation. Our work suggests that quasi-character-level models generalize practically as well as character-based models but at a lower computational cost. Furthermore, they appear to achieve greater consistency across domains than standard subword-level models, although they do not mitigate the catastrophic forgetting problem.
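As an illustration of the idea (not the authors' exact setup), a quasi-character-level vocabulary can be approximated by training an ordinary subword model whose vocabulary size is only slightly larger than the raw character inventory. The sketch below uses the SentencePiece library; the corpus path and the vocabulary sizes are hypothetical values chosen for illustration.

```python
# Illustrative sketch: a "quasi-character-level" vocabulary built as a BPE
# model whose vocabulary barely exceeds the character set, compared with a
# typical subword-level baseline. File name and sizes are assumptions.
import sentencepiece as spm

CORPUS = "train.en"          # hypothetical training file, one sentence per line
QUASI_CHAR_VOCAB = 350       # barely above the character inventory
SUBWORD_VOCAB = 32000        # common subword-level baseline size

for vocab_size, prefix in [(QUASI_CHAR_VOCAB, "quasi_char"),
                           (SUBWORD_VOCAB, "subword")]:
    spm.SentencePieceTrainer.train(
        input=CORPUS,
        model_prefix=prefix,
        vocab_size=vocab_size,
        model_type="bpe",
        character_coverage=1.0,   # keep all characters to avoid <unk> tokens
    )

# The quasi-character model segments words into very short pieces,
# approaching character-level granularity at a much smaller sequence cost
# than one token per character.
quasi = spm.SentencePieceProcessor(model_file="quasi_char.model")
sub = spm.SentencePieceProcessor(model_file="subword.model")
sentence = "Machine translation for low-resource languages"
print(quasi.encode(sentence, out_type=str))
print(sub.encode(sentence, out_type=str))
```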
Anthology ID:
2022.amta-research.10
Volume:
Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)
Month:
September
Year:
2022
Address:
Orlando, USA
Editors:
Kevin Duh, Francisco Guzmán
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
131–143
URL:
https://aclanthology.org/2022.amta-research.10
Cite (ACL):
Salvador Carrión and Francisco Casacuberta. 2022. On the Effectiveness of Quasi Character-Level Models for Machine Translation. In Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track), pages 131–143, Orlando, USA. Association for Machine Translation in the Americas.
Cite (Informal):
On the Effectiveness of Quasi Character-Level Models for Machine Translation (Carrión & Casacuberta, AMTA 2022)
PDF:
https://aclanthology.org/2022.amta-research.10.pdf