miCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings

Tassilo Klein, Moin Nabi


Abstract
This paper presents miCSE, a mutual information-based contrastive learning framework that significantly advances the state of the art in few-shot sentence embedding. The proposed approach imposes alignment between the attention patterns of different views during contrastive learning. Learning sentence embeddings with miCSE entails enforcing structural consistency across the augmented views of every sentence, making contrastive self-supervised learning more sample efficient. As a result, the proposed approach shows strong performance in the few-shot learning domain. While it achieves superior results compared to state-of-the-art methods on multiple few-shot learning benchmarks, it is comparable to them in the full-shot scenario. This study opens up avenues for efficient self-supervised learning methods that are more robust than current contrastive methods for sentence embedding.
Anthology ID:
2023.acl-long.339
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
6159–6177
URL:
https://aclanthology.org/2023.acl-long.339
DOI:
10.18653/v1/2023.acl-long.339
Cite (ACL):
Tassilo Klein and Moin Nabi. 2023. miCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6159–6177, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
miCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings (Klein & Nabi, ACL 2023)
PDF:
https://aclanthology.org/2023.acl-long.339.pdf
Video:
https://aclanthology.org/2023.acl-long.339.mp4