Connecting Attributions and QA Model Behavior on Realistic Counterfactuals

Xi Ye, Rohan Nair, Greg Durrett


Abstract
When a model attribution technique highlights a particular part of the input, a user might understand this highlight as making a statement about counterfactuals (Miller, 2019): if that part of the input were to change, the model’s prediction might change as well. This paper investigates how well different attribution techniques align with this assumption on realistic counterfactuals in the case of reading comprehension (RC). RC is a particularly challenging test case, as token-level attributions that have been extensively studied in other NLP tasks such as sentiment analysis are less suitable to represent the reasoning that RC models perform. We construct counterfactual sets for three different RC settings, and through heuristics that can connect attribution methods’ outputs to high-level model behavior, we can evaluate how useful different attribution methods and even different formats are for understanding counterfactuals. We find that pairwise attributions are better suited to RC than token-level attributions across these different RC settings, with our best performance coming from a modification that we propose to an existing pairwise attribution method.
Anthology ID:
2021.emnlp-main.447
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
5496–5512
URL:
https://aclanthology.org/2021.emnlp-main.447
DOI:
10.18653/v1/2021.emnlp-main.447
Cite (ACL):
Xi Ye, Rohan Nair, and Greg Durrett. 2021. Connecting Attributions and QA Model Behavior on Realistic Counterfactuals. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 5496–5512, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Connecting Attributions and QA Model Behavior on Realistic Counterfactuals (Ye et al., EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.447.pdf
Video:
https://aclanthology.org/2021.emnlp-main.447.mp4
Code:
xiye17/EvalQAExpl
Data:
HotpotQA, SQuAD