Measuring and Improving Consistency in Pretrained Language Models

Yanai Elazar, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard Hovy, Hinrich Schütze, Yoav Goldberg


Abstract
Consistency of a model—that is, the invariance of its behavior under meaning-preserving alternations in its input—is a highly desirable property in natural language processing. In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge? To this end, we create ParaRel🤘, a high-quality resource of cloze-style query English paraphrases. It contains a total of 328 paraphrases for 38 relations. Using ParaRel🤘, we show that the consistency of all PLMs we experiment with is poor—though with high variance between relations. Our analysis of the representational spaces of PLMs suggests that they have a poor structure and are currently not suitable for representing knowledge robustly. Finally, we propose a method for improving model consistency and experimentally demonstrate its effectiveness.
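A minimal sketch of the consistency notion described in the abstract (an illustrative assumption, not the authors' exact evaluation protocol): query a masked language model with two paraphrased cloze templates via the HuggingFace transformers fill-mask pipeline and check whether the top predictions agree. The model name, templates, and variable names below are assumptions chosen for demonstration.

# Hedged sketch: does a masked LM give the same answer to paraphrased cloze queries?
# Model, templates, and agreement check are illustrative, not the ParaRel evaluation code.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-cased")

# Two meaning-preserving paraphrases of the same relational query.
paraphrases = [
    "Paris is the capital of [MASK].",
    "The capital of [MASK] is Paris.",
]

# Take the top-ranked token predicted for each paraphrase.
top_predictions = [fill_mask(p)[0]["token_str"] for p in paraphrases]

# The model is consistent on this query if all paraphrases elicit the same object.
is_consistent = len(set(top_predictions)) == 1
print(top_predictions, "consistent:", is_consistent)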
Anthology ID:
2021.tacl-1.60
Volume:
Transactions of the Association for Computational Linguistics, Volume 9
Year:
2021
Address:
Cambridge, MA
Editors:
Brian Roark, Ani Nenkova
Venue:
TACL
Publisher:
MIT Press
Pages:
1012–1031
URL:
https://aclanthology.org/2021.tacl-1.60
DOI:
10.1162/tacl_a_00410
Cite (ACL):
Yanai Elazar, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard Hovy, Hinrich Schütze, and Yoav Goldberg. 2021. Measuring and Improving Consistency in Pretrained Language Models. Transactions of the Association for Computational Linguistics, 9:1012–1031.
Cite (Informal):
Measuring and Improving Consistency in Pretrained Language Models (Elazar et al., TACL 2021)
PDF:
https://aclanthology.org/2021.tacl-1.60.pdf
Video:
https://aclanthology.org/2021.tacl-1.60.mp4