MSD-1030: A Well-built Multi-Sense Evaluation Dataset for Sense Representation Models

Ting-Yu Yen, Yang-Yin Lee, Yow-Ting Shiue, Hen-Hsen Huang, Hsin-Hsi Chen


Abstract
Sense embedding models handle polysemy by giving each distinct meaning of a word form a separate representation. They are considered improvements over word models, and their effectiveness is usually judged with benchmarks such as semantic similarity datasets. However, most of these datasets are not designed for evaluating sense embeddings. In this research, we show that there are at least six concerns about evaluating sense embeddings with existing benchmark datasets, including the large proportion of single-sense words and the unexpectedly inferior performance of several multi-sense models relative to their single-sense counterparts. These observations seriously call into question whether evaluations based on these datasets can reflect a sense model’s ability to capture different meanings. To address these issues, we propose the Multi-Sense Dataset (MSD-1030), which contains a high ratio of multi-sense word pairs. A series of analyses and experiments show that MSD-1030 serves as a more reliable benchmark for sense embeddings. The dataset is available at http://nlg.csie.ntu.edu.tw/nlpresource/MSD-1030/.
Anthology ID: 2020.lrec-1.711
Volume: Proceedings of the Twelfth Language Resources and Evaluation Conference
Month: May
Year: 2020
Address: Marseille, France
Editors: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association
Pages: 5802–5809
Language: English
URL: https://aclanthology.org/2020.lrec-1.711
Cite (ACL): Ting-Yu Yen, Yang-Yin Lee, Yow-Ting Shiue, Hen-Hsen Huang, and Hsin-Hsi Chen. 2020. MSD-1030: A Well-built Multi-Sense Evaluation Dataset for Sense Representation Models. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 5802–5809, Marseille, France. European Language Resources Association.
Cite (Informal): MSD-1030: A Well-built Multi-Sense Evaluation Dataset for Sense Representation Models (Yen et al., LREC 2020)
PDF: https://aclanthology.org/2020.lrec-1.711.pdf