Style Pooling: Automatic Text Style Obfuscation for Improved Classification Fairness

Fatemehsadat Mireshghallah, Taylor Berg-Kirkpatrick


Abstract
Text style can reveal sensitive attributes of the author (e.g., age and race) to the reader, which can, in turn, lead to privacy violations and bias in both human and algorithmic decisions based on text. For example, the style of writing in job applications might reveal protected attributes of the candidate, which could lead to bias in hiring decisions, regardless of whether those decisions are made algorithmically or by humans. We propose a VAE-based framework that obfuscates stylistic features of human-generated text through style transfer, by automatically rewriting the text itself. Critically, our framework operationalizes obfuscated style flexibly, supporting two distinct notions: (1) a minimal notion that effectively intersects the various styles seen in training, and (2) a maximal notion that seeks to obfuscate by adding stylistic features of all sensitive attributes to the text, in effect computing a union of styles. Although our style-obfuscation framework can be used for multiple purposes, we demonstrate its effectiveness in improving the fairness of downstream classifiers. We also conduct a comprehensive study of style pooling's effect on fluency, semantic consistency, and attribute removal from text, in two- and three-domain style transfer.
Anthology ID:
2021.emnlp-main.152
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
2009–2022
URL:
https://aclanthology.org/2021.emnlp-main.152
DOI:
10.18653/v1/2021.emnlp-main.152
Cite (ACL):
Fatemehsadat Mireshghallah and Taylor Berg-Kirkpatrick. 2021. Style Pooling: Automatic Text Style Obfuscation for Improved Classification Fairness. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 2009–2022, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Style Pooling: Automatic Text Style Obfuscation for Improved Classification Fairness (Mireshghallah & Berg-Kirkpatrick, EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.152.pdf
Video:
https://aclanthology.org/2021.emnlp-main.152.mp4
Code:
mireshghallah/style-pooling