Lost in Distillation: A Case Study in Toxicity Modeling

Alyssa Chvasta, Alyssa Lees, Jeffrey Sorensen, Lucy Vasserman, Nitesh Goyal


Abstract
In an era of increasingly large pre-trained language models, knowledge distillation is a powerful tool for transferring information from a large model to a smaller one. Distillation is especially beneficial under real-world constraints such as serving latency or serving at scale. However, the process may hide a loss of robustness in language understanding that high-level evaluation metrics do not immediately reveal. In this work, we investigate these hidden costs: what is “lost in distillation”, particularly with regard to identity-based model bias, using toxicity modeling as a case study. With reproducible models trained on open source datasets, we examine models distilled from a BERT teacher baseline, and we investigate the hidden performance costs of both open source and proprietary big-data models.
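For orientation, below is a minimal sketch of the generic soft-label distillation objective commonly used to train a student from a teacher such as BERT (Hinton et al., 2015). The function name, temperature value, and use of PyTorch are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """Soft-label distillation: KL divergence between the temperature-scaled
    teacher distribution and the student distribution."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * temperature ** 2
```

In a typical setup this term is combined with the ordinary supervised loss on the hard labels; the weighting between the two is a hyperparameter.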
Anthology ID:
2022.woah-1.9
Volume:
Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)
Month:
July
Year:
2022
Address:
Seattle, Washington (Hybrid)
Editors:
Kanika Narang, Aida Mostafazadeh Davani, Lambert Mathias, Bertie Vidgen, Zeerak Talat
Venue:
WOAH
Publisher:
Association for Computational Linguistics
Pages:
92–101
URL:
https://aclanthology.org/2022.woah-1.9
DOI:
10.18653/v1/2022.woah-1.9
Cite (ACL):
Alyssa Chvasta, Alyssa Lees, Jeffrey Sorensen, Lucy Vasserman, and Nitesh Goyal. 2022. Lost in Distillation: A Case Study in Toxicity Modeling. In Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH), pages 92–101, Seattle, Washington (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Lost in Distillation: A Case Study in Toxicity Modeling (Chvasta et al., WOAH 2022)
PDF:
https://aclanthology.org/2022.woah-1.9.pdf
Video:
https://aclanthology.org/2022.woah-1.9.mp4
Data
C4, Civil Comments, WikiConv