Generating Better Items for Cognitive Assessments Using Large Language Models

Antonio Laverghetta Jr., John Licato


Abstract
Writing high-quality test questions (items) is critical to building educational measures but has traditionally been a time-consuming process. One promising avenue for alleviating this is automated item generation, whereby methods from artificial intelligence (AI) are used to generate new items with minimal human intervention. Researchers have explored using large language models (LLMs) to generate new items with psychometric properties equivalent to human-written ones. But can LLMs generate items with improved psychometric properties, even when existing items have poor validity evidence? We investigate this using items from a natural language inference (NLI) dataset. We develop a novel prompting strategy based on selecting items with both the best and worst properties to use in the prompt, and use GPT-3 to generate new NLI items. We find that the GPT-3 items show improved psychometric properties in many cases, while also possessing good content, convergent, and discriminant validity evidence. Collectively, our results demonstrate the potential of employing LLMs to ease the item development process and suggest that the careful use of prompting may allow for iterative improvement of item quality.
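To make the abstract's prompting idea concrete, the sketch below shows one way a best/worst-item few-shot prompt could be assembled and sent to a GPT-3 completion model. This is a minimal, hypothetical illustration: the item fields, the use of a per-item discrimination estimate as the selection criterion, the prompt wording, and the model name are all assumptions for exposition and do not reproduce the authors' actual prompts or code (see the paper's attachment for their materials).

```python
# Hypothetical sketch: build a few-shot prompt from the best- and worst-performing
# NLI items (ranked here by an assumed per-item discrimination estimate) and ask a
# GPT-3 completion model to write a new item. Field names, prompt wording, and the
# model choice are illustrative assumptions, not the authors' exact setup.
import openai  # classic (pre-1.0) OpenAI Python client interface

openai.api_key = "YOUR_API_KEY"

# Assumed item format: premise, hypothesis, gold label, and a discrimination estimate.
items = [
    {"premise": "A man is playing a guitar on stage.",
     "hypothesis": "A musician is performing.",
     "label": "entailment", "discrimination": 1.8},
    {"premise": "Two dogs are running through a field.",
     "hypothesis": "The animals are outside.",
     "label": "entailment", "discrimination": 0.2},
    # ... more items with psychometric estimates ...
]

def build_prompt(items, k=1):
    """Select the k best and k worst items by discrimination and assemble a prompt."""
    ranked = sorted(items, key=lambda it: it["discrimination"], reverse=True)
    best, worst = ranked[:k], ranked[-k:]

    def fmt(it):
        return f"Premise: {it['premise']}\nHypothesis: {it['hypothesis']}\nLabel: {it['label']}"

    return (
        "Here are NLI items that measured the construct well:\n\n"
        + "\n\n".join(fmt(it) for it in best)
        + "\n\nHere are NLI items that measured the construct poorly:\n\n"
        + "\n\n".join(fmt(it) for it in worst)
        + "\n\nWrite a new NLI item (premise, hypothesis, label) that resembles the "
          "well-performing items and avoids the problems of the poorly performing ones.\n"
    )

response = openai.Completion.create(
    model="text-davinci-003",  # a GPT-3-family model; assumed, not necessarily the paper's
    prompt=build_prompt(items),
    max_tokens=150,
    temperature=0.7,
)
print(response["choices"][0]["text"].strip())
```

In practice, the generated items would then need to be administered and re-estimated psychometrically before being fed back into the prompt, which is the kind of iterative improvement loop the abstract alludes to.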
Anthology ID: 2023.bea-1.34
Volume: Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Ekaterina Kochmar, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Nitin Madnani, Anaïs Tack, Victoria Yaneva, Zheng Yuan, Torsten Zesch
Venue: BEA
SIG: SIGEDU
Publisher: Association for Computational Linguistics
Pages: 414–428
URL: https://aclanthology.org/2023.bea-1.34
DOI: 10.18653/v1/2023.bea-1.34
Cite (ACL): Antonio Laverghetta Jr. and John Licato. 2023. Generating Better Items for Cognitive Assessments Using Large Language Models. In Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023), pages 414–428, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Generating Better Items for Cognitive Assessments Using Large Language Models (Laverghetta Jr. & Licato, BEA 2023)
PDF: https://aclanthology.org/2023.bea-1.34.pdf
Attachment: 2023.bea-1.34.attachment.zip