The HW-TSC’s Simultaneous Speech Translation System for IWSLT 2022 Evaluation

Minghan Wang, Jiaxin Guo, Yinglu Li, Xiaosong Qiao, Yuxia Wang, Zongyao Li, Chang Su, Yimeng Chen, Min Zhang, Shimin Tao, Hao Yang, Ying Qin


Abstract
This paper presents our work in the participation of IWSLT 2022 simultaneous speech translation evaluation. For the track of text-to-text (T2T), we participate in three language pairs and build wait-k based simultaneous MT (SimulMT) model for the task. The model was pretrained on WMT21 news corpora, and was further improved with in-domain fine-tuning and self-training. For the speech-to-text (S2T) track, we designed both cascade and end-to-end form in three language pairs. The cascade system is composed of a chunking-based streaming ASR model and the SimulMT model used in the T2T track. The end-to-end system is a simultaneous speech translation (SimulST) model based on wait-k strategy, which is directly trained on a synthetic corpus produced by translating all texts of ASR corpora into specific target language with an offline MT model. It also contains a heuristic sentence breaking strategy, preventing it from finishing the translation before the the end of the speech. We evaluate our systems on the MUST-C tst-COMMON dataset and show that the end-to-end system is competitive to the cascade one. Meanwhile, we also demonstrate that the SimulMT model can be efficiently optimized by these approaches, resulting in the improvements of 1-2 BLEU points.
Anthology ID:
2022.iwslt-1.21
Volume:
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)
Month:
May
Year:
2022
Address:
Dublin, Ireland (in-person and online)
Editors:
Elizabeth Salesky, Marcello Federico, Marta Costa-jussà
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
Association for Computational Linguistics
Note:
Pages:
247–254
Language:
URL:
https://aclanthology.org/2022.iwslt-1.21
DOI:
10.18653/v1/2022.iwslt-1.21
Bibkey:
Cite (ACL):
Minghan Wang, Jiaxin Guo, Yinglu Li, Xiaosong Qiao, Yuxia Wang, Zongyao Li, Chang Su, Yimeng Chen, Min Zhang, Shimin Tao, Hao Yang, and Ying Qin. 2022. The HW-TSC’s Simultaneous Speech Translation System for IWSLT 2022 Evaluation. In Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022), pages 247–254, Dublin, Ireland (in-person and online). Association for Computational Linguistics.
Cite (Informal):
The HW-TSC’s Simultaneous Speech Translation System for IWSLT 2022 Evaluation (Wang et al., IWSLT 2022)
Copy Citation:
PDF:
https://aclanthology.org/2022.iwslt-1.21.pdf
Data
LibriSpeech