Is the Answer in the Text? Challenging ChatGPT with Evidence Retrieval from Instructive Text

Sophie Henning, Talita Anthonio, Wei Zhou, Heike Adel, Mohsen Mesgar, Annemarie Friedrich


Abstract
Generative language models have recently shown remarkable success in generating answers to questions in a given textual context. However, these answers may suffer from hallucination, wrongly cite evidence, and spread misleading information. In this work, we address this problem by employing ChatGPT, a state-of-the-art generative model, as a machine-reading system. We ask it to retrieve answers to lexically varied and open-ended questions from trustworthy instructive texts. We introduce WHERE (WikiHow Evidence REtrieval), a new high-quality evaluation benchmark consisting of WikiHow articles exhaustively annotated with evidence sentences for questions. The benchmark comes with a special challenge: all questions are about the article’s topic, but not all can be answered using the provided context. Interestingly, we find that with a regular question-answering prompt, ChatGPT fails to detect the unanswerable cases. When provided with a few examples, it learns to better judge whether a text provides answer evidence. Alongside this important finding, our dataset defines a new benchmark for evidence retrieval in question answering, which we argue is one of the necessary next steps for making large language models more trustworthy.
Anthology ID:
2023.findings-emnlp.949
Original:
2023.findings-emnlp.949v1
Version 2:
2023.findings-emnlp.949v2
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14229–14241
URL:
https://aclanthology.org/2023.findings-emnlp.949
Cite (ACL):
Sophie Henning, Talita Anthonio, Wei Zhou, Heike Adel, Mohsen Mesgar, and Annemarie Friedrich. 2023. Is the Answer in the Text? Challenging ChatGPT with Evidence Retrieval from Instructive Text. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 14229–14241, Singapore. Association for Computational Linguistics.
Cite (Informal):
Is the Answer in the Text? Challenging ChatGPT with Evidence Retrieval from Instructive Text (Henning et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.949.pdf