Discovering Biases in Information Retrieval Models Using Relevance Thesaurus as Global Explanation

Youngwoo Kim, Razieh Rahimi, James Allan


Abstract
Most efforts in interpreting neural relevance models have focused on local explanations, which explain the relevance of a document to a query. However, local explanations are not effective in predicting the model’s behavior on unseen texts. We aim to explain a neural relevance model by providing lexical explanations that can be globally generalized. Specifically, we construct a relevance thesaurus containing semantically relevant query term and document term pairs, which can augment BM25 scoring functions to better approximate the neural model’s predictions. We propose a novel method for relevance thesaurus construction. Our method involves training a neural relevance model which can score the relevance for partial segments of queries and documents. The trained model is used to identify relevant terms over the vocabulary space. The resulting thesaurus explanation is evaluated based on ranking effectiveness and fidelity to the targeted neural ranking model. Finally, our thesaurus reveals the existence of brand name bias in ranking models, which further supports the utility of our explanation method.
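As a rough illustration of how a relevance thesaurus might augment BM25, the sketch below lets a query term match document terms through weighted thesaurus entries. This is not the authors' formulation, which the abstract does not specify. The function name `bm25_with_thesaurus`, the "soft term frequency" aggregation, the weight range, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_with_thesaurus(query, doc, doc_freq, num_docs, avg_len,
                        thesaurus, k1=1.2, b=0.75):
    """Score a tokenized doc against a tokenized query.

    thesaurus maps (query_term, doc_term) -> weight, assumed to lie in [0, 1].
    Hypothetical helper: the aggregation rule here is an assumption,
    not the method from the paper.
    """
    tf = Counter(doc)
    length_norm = k1 * (1 - b + b * len(doc) / avg_len)
    score = 0.0
    for q in query:
        # Exact matches count fully; thesaurus matches count by their weight.
        soft_tf = sum(
            count * (1.0 if t == q else thesaurus.get((q, t), 0.0))
            for t, count in tf.items()
        )
        if soft_tf == 0:
            continue
        df = doc_freq.get(q, 0)
        # Smoothed, always-positive IDF variant.
        idf = math.log(1 + (num_docs - df + 0.5) / (df + 0.5))
        score += idf * soft_tf * (k1 + 1) / (soft_tf + length_norm)
    return score


# Toy data (invented for this example).
query = ["car"]
doc = ["the", "automobile", "was", "parked"]
doc_freq = {"car": 10}
thesaurus = {("car", "automobile"): 0.8}

plain = bm25_with_thesaurus(query, doc, doc_freq, 100, 4.0, {})
augmented = bm25_with_thesaurus(query, doc, doc_freq, 100, 4.0, thesaurus)
```

With an empty thesaurus the function reduces to ordinary BM25, so the document above receives no credit for "automobile". With the thesaurus entry, the same document receives a positive but discounted score, which is the kind of lexical, globally applicable signal the abstract describes.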
Anthology ID:
2024.emnlp-main.1089
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
19530–19547
URL:
https://aclanthology.org/2024.emnlp-main.1089/
DOI:
10.18653/v1/2024.emnlp-main.1089
Cite (ACL):
Youngwoo Kim, Razieh Rahimi, and James Allan. 2024. Discovering Biases in Information Retrieval Models Using Relevance Thesaurus as Global Explanation. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 19530–19547, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Discovering Biases in Information Retrieval Models Using Relevance Thesaurus as Global Explanation (Kim et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.1089.pdf