Interactive Construction of User-Centric Dictionary for Text Analytics

Ryosuke Kohita, Issei Yoshida, Hiroshi Kanayama, Tetsuya Nasukawa


Abstract
We propose a methodology for constructing a term dictionary for text analytics through an interactive process between a human and a machine, which supports the creation of flexible dictionaries with the precise granularity required in typical text analysis. This paper introduces the first formulation of interactive dictionary construction to address this need. To optimize the interaction, we propose a new algorithm that effectively captures an analyst’s intention starting from only a small number of sample terms. Along with the algorithm, we also design an automatic evaluation framework that provides a systematic assessment of any interactive method for the dictionary creation task. Experiments using corpora and dictionaries based on real scenarios show that our algorithm outperforms baseline methods and works even with a small number of interactions.
Anthology ID:
2020.acl-main.72
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
789–799
URL:
https://aclanthology.org/2020.acl-main.72
DOI:
10.18653/v1/2020.acl-main.72
Cite (ACL):
Ryosuke Kohita, Issei Yoshida, Hiroshi Kanayama, and Tetsuya Nasukawa. 2020. Interactive Construction of User-Centric Dictionary for Text Analytics. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 789–799, Online. Association for Computational Linguistics.
Cite (Informal):
Interactive Construction of User-Centric Dictionary for Text Analytics (Kohita et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.72.pdf
Video:
http://slideslive.com/38928823