Views Are My Own, but Also Yours: Benchmarking Theory of Mind Using Common Ground

Adil Soubki, John Murzaku, Arash Yousefi Jordehi, Peter Zeng, Magdalena Markowska, Seyed Abolghasem Mirroshandel, Owen Rambow


Abstract
Evaluating the theory of mind (ToM) capabilities of language models (LMs) has recently received a great deal of attention. However, many existing benchmarks rely on synthetic data, which risks misaligning the resulting experiments with human behavior. We introduce the first ToM dataset based on naturally occurring spoken dialogs, Common-ToM, and show that LMs struggle to demonstrate ToM. We then show that integrating a simple, explicit representation of beliefs improves LM performance on Common-ToM.
Anthology ID:
2024.findings-acl.880
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14815–14823
URL:
https://aclanthology.org/2024.findings-acl.880
DOI:
10.18653/v1/2024.findings-acl.880
Cite (ACL):
Adil Soubki, John Murzaku, Arash Yousefi Jordehi, Peter Zeng, Magdalena Markowska, Seyed Abolghasem Mirroshandel, and Owen Rambow. 2024. Views Are My Own, but Also Yours: Benchmarking Theory of Mind Using Common Ground. In Findings of the Association for Computational Linguistics ACL 2024, pages 14815–14823, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
Views Are My Own, but Also Yours: Benchmarking Theory of Mind Using Common Ground (Soubki et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.880.pdf