Towards Higher Pareto Frontier in Multilingual Machine Translation

Yichong Huang, Xiaocheng Feng, Xinwei Geng, Baohang Li, Bing Qin


Abstract
Multilingual neural machine translation has witnessed remarkable progress in recent years. However, the long-tailed distribution of multilingual corpora poses a challenge of Pareto optimization, i.e., optimizing for some languages may come at the cost of degrading the performance of others. Existing balancing training strategies are equivalent to a series of Pareto optimal solutions, which trade off on a Pareto frontier. (In Pareto optimization, Pareto optimal solutions refer to solutions in which none of the objectives can be improved without sacrificing at least one of the other objectives. The set of all Pareto optimal solutions forms a Pareto frontier.) In this work, we propose a new training framework, Pareto Mutual Distillation (Pareto-MD), towards pushing the Pareto frontier outwards rather than making trade-offs. Specifically, Pareto-MD collaboratively trains two Pareto optimal solutions that favor different languages and allows them to learn from the strengths of each other via knowledge distillation. Furthermore, we introduce a novel strategy to enable stronger communication between Pareto optimal solutions and broaden the applicability of our approach. Experimental results on the widely-used WMT and TED datasets show that our method significantly pushes the Pareto frontier and outperforms baselines by up to +2.46 BLEU. (Our code will be released upon acceptance.)
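The definition of Pareto optimality used in the abstract can be illustrated with a minimal sketch. This is not the authors' code or the Pareto-MD method itself; it only shows how a Pareto frontier is picked out of a set of candidate solutions. The function names `dominates` and `pareto_frontier` and the score pairs are hypothetical, chosen for illustration, and higher scores are assumed to be better on every objective.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_frontier(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical scores, not results from the paper:
# (score on high-resource languages, score on low-resource languages).
candidates = [
    (30.0, 18.0),
    (29.0, 20.0),
    (28.0, 19.0),  # dominated by (29.0, 20.0)
    (27.0, 21.0),
    (30.0, 17.0),  # dominated by (30.0, 18.0)
]

frontier = pareto_frontier(candidates)
print(frontier)
```

Under these assumptions, moving between the three surviving points is a trade-off along the frontier, whereas a new solution that dominates all of them would correspond to pushing the frontier outwards.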
Anthology ID:
2023.acl-long.211
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3802–3818
URL:
https://aclanthology.org/2023.acl-long.211
DOI:
10.18653/v1/2023.acl-long.211
Cite (ACL):
Yichong Huang, Xiaocheng Feng, Xinwei Geng, Baohang Li, and Bing Qin. 2023. Towards Higher Pareto Frontier in Multilingual Machine Translation. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3802–3818, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Towards Higher Pareto Frontier in Multilingual Machine Translation (Huang et al., ACL 2023)
PDF:
https://aclanthology.org/2023.acl-long.211.pdf
Video:
https://aclanthology.org/2023.acl-long.211.mp4