Causal Matching with Text Embeddings: A Case Study in Estimating the Causal Effects of Peer Review Policies

Raymond Zhang, Neha Nayak Kennard, Daniel Smith, Daniel McFarland, Andrew McCallum, Katherine Keith


Abstract
A promising approach to estimating the causal effects of peer review policies is to analyze data from publication venues that shift policies from single-blind to double-blind from one year to the next. However, in these settings the content of the manuscript is a confounding variable: each year has a different distribution of scientific content, which may naturally affect the distribution of reviewer scores. To address this textual confounding, we extend variable ratio nearest neighbor matching to incorporate text embeddings. We compare this matching method to a widely used causal method of stratified propensity score matching and a baseline of randomly selected matches. For our case study of the ICLR conference shifting from single- to double-blind review from 2017 to 2018, we find human judges prefer manuscript matches from our method in 70% of cases. While the unadjusted estimate of the average causal effect on reviewers’ scores is -0.25, our method shifts the estimate to -0.17, a slightly smaller difference between the outcomes of single- and double-blind policies. We hope this case study enables exploration of additional text-based causal estimation methods and domains in the future.
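As a rough illustration of the general idea described in the abstract, the sketch below matches each treated unit to a variable number of control units by cosine similarity of their embeddings, then averages the outcome differences over matched units. It is not the authors' implementation: the function name `match_and_estimate`, the parameters `max_k` and `min_sim`, the similarity threshold, and the toy data are all assumptions made for this example, and the paper's actual matching criteria and estimator may differ.

```python
import numpy as np

def match_and_estimate(emb_t, y_t, emb_c, y_c, max_k=3, min_sim=0.8):
    """Hypothetical variable-ratio nearest-neighbor matching on embeddings.

    Each treated unit keeps up to max_k control units whose cosine similarity
    is at least min_sim; treated units with no such control are dropped.
    Returns (mean outcome difference over matched treated units, matches).
    """
    emb_t, emb_c = np.asarray(emb_t, float), np.asarray(emb_c, float)
    y_t, y_c = np.asarray(y_t, float), np.asarray(y_c, float)

    # Normalize rows so that a dot product equals cosine similarity.
    t = emb_t / np.linalg.norm(emb_t, axis=1, keepdims=True)
    c = emb_c / np.linalg.norm(emb_c, axis=1, keepdims=True)
    sims = t @ c.T  # shape: (n_treated, n_control)

    matches, diffs = {}, []
    for i in range(sims.shape[0]):
        nearest = np.argsort(-sims[i])[:max_k]  # most similar controls first
        kept = [int(j) for j in nearest if sims[i, j] >= min_sim]
        if not kept:
            continue  # no sufficiently similar control: drop this unit
        matches[i] = kept
        diffs.append(y_t[i] - y_c[kept].mean())

    estimate = float(np.mean(diffs)) if diffs else float("nan")
    return estimate, matches

# Toy data (made up): 2-d "embeddings" and review scores.
emb_c = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]    # control manuscripts
y_c = [6.0, 7.0, 4.0]
emb_t = [[1.0, 0.05], [0.0, 1.0], [-1.0, 0.0]]  # treated manuscripts
y_t = [6.0, 3.0, 5.0]

unadjusted = float(np.mean(y_t) - np.mean(y_c))
estimate, matches = match_and_estimate(emb_t, y_t, emb_c, y_c)
print("unadjusted:", unadjusted)
print("matched:", estimate, matches)
```

In this toy setup the third treated unit has no control above the threshold and is dropped, so the matched estimate is computed over fewer units than the unadjusted one. This trade-off between match quality and sample size is a general property of threshold-based matching, not a claim about the paper's results.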
Anthology ID:
2023.findings-acl.83
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1284–1297
URL:
https://aclanthology.org/2023.findings-acl.83
DOI:
10.18653/v1/2023.findings-acl.83
Cite (ACL):
Raymond Zhang, Neha Nayak Kennard, Daniel Smith, Daniel McFarland, Andrew McCallum, and Katherine Keith. 2023. Causal Matching with Text Embeddings: A Case Study in Estimating the Causal Effects of Peer Review Policies. In Findings of the Association for Computational Linguistics: ACL 2023, pages 1284–1297, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Causal Matching with Text Embeddings: A Case Study in Estimating the Causal Effects of Peer Review Policies (Zhang et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.83.pdf