Europarl Datasets with Demographic Speaker Information

Eva Vanmassenhove, Christian Hardmeier


Abstract
Research on speaker-adapted neural machine translation (NMT) is scarce. One of the main challenges for more personalized MT systems is finding large enough annotated parallel datasets with speaker information. Rabinovich et al. (2017) published an annotated parallel dataset for EN–FR and EN–DE; however, for many other language pairs, no sufficiently large annotated datasets are available.
Anthology ID:
2018.eamt-main.59
Volume:
Proceedings of the 21st Annual Conference of the European Association for Machine Translation
Month:
May
Year:
2018
Address:
Alicante, Spain
Editors:
Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez, Miquel Esplà-Gomis, Maja Popović, Celia Rico, André Martins, Joachim Van den Bogaert, Mikel L. Forcada
Venue:
EAMT
Pages:
391
URL:
https://aclanthology.org/2018.eamt-main.59
Cite (ACL):
Eva Vanmassenhove and Christian Hardmeier. 2018. Europarl Datasets with Demographic Speaker Information. In Proceedings of the 21st Annual Conference of the European Association for Machine Translation, page 391, Alicante, Spain.
Cite (Informal):
Europarl Datasets with Demographic Speaker Information (Vanmassenhove & Hardmeier, EAMT 2018)
PDF:
https://aclanthology.org/2018.eamt-main.59.pdf