MCLF: A Multi-grained Contrastive Learning Framework for ASR-robust Spoken Language Understanding

Zhiqi Huang, Dongsheng Chen, Zhihong Zhu, Xuxin Cheng


Abstract
Robustness to Automatic Speech Recognition (ASR) errors is of great importance for Spoken Language Understanding (SLU). Recent ASR-robust SLU systems have achieved impressive improvements through global contrastive learning. However, although most ASR errors occur only at local positions in an utterance, they can easily lead to severe semantic changes, and utterance-level classification or comparison struggles to distinguish such differences. To address this problem, we propose a two-stage multi-grained contrastive learning framework dubbed MCLF. Technically, we first adapt the pre-trained language model to each downstream SLU dataset via the proposed multi-grained contrastive learning objective and then fine-tune it on the corresponding dataset. In addition, to facilitate contrastive learning in the pre-training stage, we explore several data augmentation methods to expand the training data. Experimental results and detailed analyses on four datasets and four BERT-like backbone models demonstrate the effectiveness of our approach.
Anthology ID:
2023.findings-emnlp.533
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
7936–7949
URL:
https://aclanthology.org/2023.findings-emnlp.533
DOI:
10.18653/v1/2023.findings-emnlp.533
Cite (ACL):
Zhiqi Huang, Dongsheng Chen, Zhihong Zhu, and Xuxin Cheng. 2023. MCLF: A Multi-grained Contrastive Learning Framework for ASR-robust Spoken Language Understanding. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 7936–7949, Singapore. Association for Computational Linguistics.
Cite (Informal):
MCLF: A Multi-grained Contrastive Learning Framework for ASR-robust Spoken Language Understanding (Huang et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.533.pdf