Is 42 the Answer to Everything in Subtitling-oriented Speech Translation?

Alina Karakanta, Matteo Negri, Marco Turchi


Abstract
Subtitling is becoming increasingly important for disseminating information, given the enormous amounts of audiovisual content becoming available daily. Although Neural Machine Translation (NMT) can speed up the process of translating audiovisual content, considerable manual effort is still required for transcribing the source language, and for spotting and segmenting the text into proper subtitles. Creating proper subtitles in terms of timing and segmentation depends heavily on information present in the audio (utterance duration, natural pauses). In this work, we explore two methods for applying Speech Translation (ST) to subtitling: a) a direct end-to-end approach and b) a classical cascade approach. We discuss the benefit of having access to the source language speech for improving the conformity of the generated subtitles to the spatial and temporal subtitling constraints, and show that length is not the answer to everything in the case of subtitling-oriented ST.
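The spatial and temporal constraints mentioned in the abstract are typically operationalised as a maximum number of characters per line (the "42" in the title refers to the 42-character line limit common in subtitling guidelines such as TED's) and a maximum reading speed in characters per second. Below is a minimal sketch, not taken from the paper, of how conformity of a generated subtitle could be checked; the exact thresholds (42 characters per line, 2 lines per block, 21 characters per second) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List

# Assumed constraint values (hypothetical thresholds, for illustration only):
# 42 characters per line, 2 lines per subtitle block, 21 chars/sec reading speed.
MAX_CHARS_PER_LINE = 42
MAX_LINES_PER_SUBTITLE = 2
MAX_CHARS_PER_SECOND = 21.0

@dataclass
class Subtitle:
    lines: List[str]   # text lines displayed together in one subtitle block
    start: float       # display start time, in seconds
    end: float         # display end time, in seconds

def conforms(sub: Subtitle) -> bool:
    """Check spatial (length) and temporal (reading speed) conformity."""
    # Spatial constraint: number of lines and characters per line.
    if len(sub.lines) > MAX_LINES_PER_SUBTITLE:
        return False
    if any(len(line) > MAX_CHARS_PER_LINE for line in sub.lines):
        return False
    # Temporal constraint: characters shown per second of display time.
    duration = sub.end - sub.start
    if duration <= 0:
        return False
    chars = sum(len(line) for line in sub.lines)
    return chars / duration <= MAX_CHARS_PER_SECOND

# Example: a two-line subtitle displayed for 3.5 seconds.
example = Subtitle(
    lines=["Subtitling is becoming increasingly", "important for information access."],
    start=12.0,
    end=15.5,
)
print(conforms(example))  # True: both constraints are satisfied
```

Timing (start/end) and segmentation into lines and blocks are exactly the pieces of information that, as the abstract argues, benefit from access to the source speech (utterance duration, natural pauses) rather than from text length alone.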
Anthology ID:
2020.iwslt-1.26
Volume:
Proceedings of the 17th International Conference on Spoken Language Translation
Month:
July
Year:
2020
Address:
Online
Editors:
Marcello Federico, Alex Waibel, Kevin Knight, Satoshi Nakamura, Hermann Ney, Jan Niehues, Sebastian Stüker, Dekai Wu, Joseph Mariani, Francois Yvon
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
Association for Computational Linguistics
Pages:
209–219
URL:
https://aclanthology.org/2020.iwslt-1.26
DOI:
10.18653/v1/2020.iwslt-1.26
Cite (ACL):
Alina Karakanta, Matteo Negri, and Marco Turchi. 2020. Is 42 the Answer to Everything in Subtitling-oriented Speech Translation?. In Proceedings of the 17th International Conference on Spoken Language Translation, pages 209–219, Online. Association for Computational Linguistics.
Cite (Informal):
Is 42 the Answer to Everything in Subtitling-oriented Speech Translation? (Karakanta et al., IWSLT 2020)
PDF:
https://aclanthology.org/2020.iwslt-1.26.pdf
Video:
http://slideslive.com/38929602
Data
MuST-Cinema