A Dictionary-Based Study of Word Sense Difficulty

David Alfter, Rémi Cardon, Thomas François


Abstract
In this article, we present an exploratory study of word sense difficulty as perceived by native and non-native speakers of French. We use a graded lexicon in conjunction with the French Wiktionary to generate annotation tasks in bundles of four items. Annotators manually rate the difficulty of word senses based on their usage in a sentence by selecting the easiest and the most difficult word sense out of four. Our results show that native and non-native speakers largely agree on the difficulty of words. Further, the rankings derived from the manual annotation broadly follow the levels of the words in the graded resource, although these levels were not overtly available to annotators. Using clustering, we investigate whether there is a link between the complexity of a definition and the difficulty of the associated word sense; however, the results were inconclusive. The annotated data set is available for research purposes.
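The annotation scheme described in the abstract (pick the easiest and the most difficult of four items) resembles best-worst scaling. As a minimal sketch, and not necessarily the authors' procedure, such judgments could be turned into a ranking with a simple counting score: the number of times an item was chosen as most difficult, minus the number of times it was chosen as easiest, divided by the number of times it was shown. The function name `best_worst_scores`, the tuple layout, and the word-sense identifiers below are all hypothetical.

```python
from collections import defaultdict


def best_worst_scores(annotations):
    """Counting score per item: (times most difficult - times easiest) / times shown.

    `annotations` is an iterable of (bundle, easiest, hardest) tuples, where
    `bundle` holds the four items shown together. This layout is an assumption
    made for illustration, not the format of the published data set.
    """
    shown = defaultdict(int)
    hardest = defaultdict(int)
    easiest = defaultdict(int)
    for bundle, easy, hard in annotations:
        for item in bundle:
            shown[item] += 1
        easiest[easy] += 1
        hardest[hard] += 1
    # Scores fall in [-1, 1]; higher means judged more difficult.
    return {item: (hardest[item] - easiest[item]) / shown[item] for item in shown}


# Hypothetical word-sense identifiers, two annotators rating the same bundle.
bundle = ("chat_1", "banque_2", "souris_2", "table_1")
annotations = [
    (bundle, "chat_1", "banque_2"),
    (bundle, "table_1", "banque_2"),
]

scores = best_worst_scores(annotations)
# Sort from most to least difficult.
ranking = sorted(scores, key=scores.get, reverse=True)
```

A ranking obtained this way could then be compared against the levels in the graded lexicon, for example with a rank correlation.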
Anthology ID:
2022.readi-1.3
Volume:
Proceedings of the 2nd Workshop on Tools and Resources to Empower People with REAding DIfficulties (READI) within the 13th Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Rodrigo Wilkens, David Alfter, Rémi Cardon, Núria Gala
Venue:
READI
Publisher:
European Language Resources Association
Pages:
17–24
URL:
https://aclanthology.org/2022.readi-1.3
Cite (ACL):
David Alfter, Rémi Cardon, and Thomas François. 2022. A Dictionary-Based Study of Word Sense Difficulty. In Proceedings of the 2nd Workshop on Tools and Resources to Empower People with REAding DIfficulties (READI) within the 13th Language Resources and Evaluation Conference, pages 17–24, Marseille, France. European Language Resources Association.
Cite (Informal):
A Dictionary-Based Study of Word Sense Difficulty (Alfter et al., READI 2022)
PDF:
https://aclanthology.org/2022.readi-1.3.pdf
Code:
daalft/dicomplex