Information-Theoretic Text Hallucination Reduction for Video-grounded Dialogue

Sunjae Yoon, Eunseop Yoon, Hee Suk Yoon, Junyeong Kim, Chang Yoo


Abstract
Video-grounded Dialogue (VGD) aims to generate an answer sentence to a question about a given video and dialogue context. Despite recent success in multi-modal reasoning for answer generation, existing dialogue systems still suffer from a text hallucination problem: indiscriminate copying of words from the input texts without understanding the question. This arises from spurious correlations learned from the data, since answer sentences in the dataset usually contain words from the input texts; as a result, the VGD system relies excessively on copying input words in the hope that they overlap with the ground-truth answer. Hence, we design the Text Hallucination Mitigating (THAM) framework, which incorporates a Text Hallucination Regularization (THR) loss derived from the proposed information-theoretic text hallucination measurement. Applying THAM to current dialogue systems validates its effectiveness on VGD benchmarks (i.e., AVSD@DSTC7 and AVSD@DSTC8) and shows enhanced interpretability.
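
The abstract does not spell out the THR loss, so the snippet below is only a minimal PyTorch sketch of the general idea of regularizing copy behavior: it penalizes the probability mass the decoder places on tokens that appear in the input text, on top of the usual cross-entropy objective. The function name `copy_penalty_loss`, the `weight` coefficient, and the exact form of the penalty are illustrative assumptions and not the paper's information-theoretic formulation.

```python
import torch
import torch.nn.functional as F

def copy_penalty_loss(logits, input_token_ids, weight=0.1):
    """Hypothetical copy-penalty regularizer (illustrative only, not the
    paper's exact THR loss).

    logits:          (batch, tgt_len, vocab_size) decoder output scores
    input_token_ids: (batch, src_len) token ids of the question/dialogue text
    weight:          scaling coefficient for the penalty term
    """
    batch, _, vocab_size = logits.shape
    probs = F.softmax(logits, dim=-1)                            # (B, T, V)

    # 0/1 vocabulary mask marking tokens that occur in the input text.
    # (In practice, padding and special tokens should be excluded.)
    copy_mask = torch.zeros(batch, vocab_size, device=logits.device)
    copy_mask.scatter_(1, input_token_ids, 1.0)                  # (B, V)

    # Expected probability of emitting an input-copied token at each step.
    copied_mass = (probs * copy_mask.unsqueeze(1)).sum(dim=-1)   # (B, T)
    return weight * copied_mass.mean()

# Usage sketch: add the penalty to the standard cross-entropy objective.
# total_loss = F.cross_entropy(logits.transpose(1, 2), answer_ids) \
#              + copy_penalty_loss(logits, input_token_ids)
```

In a full training loop, such a term would simply be added to the cross-entropy loss over the ground-truth answer tokens; the paper instead derives its regularizer from an information-theoretic measurement of text hallucination.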
Anthology ID:
2022.emnlp-main.280
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
4182–4193
URL:
https://aclanthology.org/2022.emnlp-main.280
DOI:
10.18653/v1/2022.emnlp-main.280
Cite (ACL):
Sunjae Yoon, Eunseop Yoon, Hee Suk Yoon, Junyeong Kim, and Chang Yoo. 2022. Information-Theoretic Text Hallucination Reduction for Video-grounded Dialogue. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 4182–4193, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Information-Theoretic Text Hallucination Reduction for Video-grounded Dialogue (Yoon et al., EMNLP 2022)
PDF:
https://aclanthology.org/2022.emnlp-main.280.pdf