NLPRL System for Very Low Resource Supervised Machine Translation

Rupjyoti Baruah, Rajesh Kumar Mundotiya, Amit Kumar, Anil kumar Singh


Abstract
This paper describes the system that we submitted to the WMT20 very low resource (VLR) supervised MT shared task. For our experiments, we use a byte-level version of BPE, which requires a base vocabulary of only 256 symbols. BPE-based models are a kind of sub-word model. Such models address the out-of-vocabulary (OOV) word problem by performing word segmentation so that the segments correspond to morphological units. They are also reported to work across different languages, especially similar languages, owing to their sub-word nature. Based on cased BLEU score, our NLPRL systems ranked ninth in the HSB-to-GER direction and tenth in the GER-to-HSB direction.
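To make the byte-level BPE idea concrete, here is a minimal illustrative sketch of the merge-learning procedure, not the authors' actual pipeline: the base vocabulary is just the 256 possible byte values, and the most frequent adjacent token pair is merged repeatedly so that frequent sub-words become single tokens. The corpus and merge count below are hypothetical.

```python
from collections import Counter

def learn_bpe(corpus, num_merges):
    # Base vocabulary: the 256 possible byte values, so no character is ever OOV.
    # Each word starts as a sequence of single-byte tokens (tuples of ints).
    words = Counter(tuple((b,) for b in w.encode("utf-8")) for w in corpus)
    merges = []
    for _ in range(num_merges):
        # Count adjacent token pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in words.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]
        merges.append((a, b))
        merged = a + b  # concatenating byte tuples yields the new sub-word token
        # Re-segment every word using the newly learned merge rule.
        new_words = Counter()
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == (a, b):
                    out.append(merged)
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_words[tuple(out)] += freq
        words = new_words
    return merges

# Hypothetical toy corpus; frequent fragments like "we" and "er" get merged first.
merges = learn_bpe(["lower", "lower", "lowest", "newer"], num_merges=5)
print([bytes(a + b).decode("utf-8", errors="replace") for a, b in merges])
```

Because every input string decomposes into bytes from this fixed 256-symbol base, the scheme never produces an unknown token, which is what makes it attractive in a very low resource setting.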
Anthology ID:
2020.wmt-1.126
Volume:
Proceedings of the Fifth Conference on Machine Translation
Month:
November
Year:
2020
Address:
Online
Editors:
Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Yvette Graham, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
1075–1078
URL:
https://aclanthology.org/2020.wmt-1.126
Cite (ACL):
Rupjyoti Baruah, Rajesh Kumar Mundotiya, Amit Kumar, and Anil kumar Singh. 2020. NLPRL System for Very Low Resource Supervised Machine Translation. In Proceedings of the Fifth Conference on Machine Translation, pages 1075–1078, Online. Association for Computational Linguistics.
Cite (Informal):
NLPRL System for Very Low Resource Supervised Machine Translation (Baruah et al., WMT 2020)
PDF:
https://aclanthology.org/2020.wmt-1.126.pdf
Video:
https://slideslive.com/38939625