Semantic Slot Prediction on low corpus data using finite user defined list

Bharatram Natarajan, Dharani Simma, Chirag Singh, Anish Nediyanchath, Sreoshi Sengupta


Abstract
Semantic slot prediction is one of the important tasks in natural language understanding (NLU). Slot prediction models depend on the quality and quantity of human-crafted training data, which affects model generalization. With the advent of voice assistants that expose AI platforms to third-party developers, the quality and quantity of training data matter for any machine learning algorithm to learn and generalize properly. AI platforms allow developers to add a custom external plist for the training data. Hence we explore a dataset, called LowCorpusSlotData, that contains low-corpus training data with a larger number of slots and significant test data. We also use an external plist with this dataset to aid slot identification. We experimented with state-of-the-art architectures such as Bidirectional Encoder Representations from Transformers (BERT) and its variants, and a Bi-directional Encoder with Custom Decoder. To address the low-corpus problem, we propose a pipeline approach in which we extract candidate slot information using the external plist extractor module and feed it as input along with the utterance.
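The pipeline idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the plist extractor matches entries or how candidates are combined with the utterance, so everything below is an assumption made for illustration: the function names `extract_candidates` and `build_model_input`, the slot names, the case-insensitive longest-match rule, and the `[SEP]` separator are all hypothetical.

```python
from typing import Dict, List


def extract_candidates(tokens: List[str], slot_lists: Dict[str, List[str]]) -> List[str]:
    """Tag each token with a candidate slot if it falls inside a listed phrase.

    Assumption: case-insensitive exact matching, longer phrases win,
    and tokens with no match are tagged "O".
    """
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Flatten the user-defined lists into (phrase_tokens, slot) pairs,
    # longest phrases first so they take priority over shorter ones.
    entries = sorted(
        ((phrase.lower().split(), slot)
         for slot, phrases in slot_lists.items()
         for phrase in phrases),
        key=lambda entry: len(entry[0]),
        reverse=True,
    )
    for phrase_tokens, slot in entries:
        n = len(phrase_tokens)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            span_free = all(tag == "O" for tag in tags[i:i + n])
            if span_free and lowered[i:i + n] == phrase_tokens:
                tags[i:i + n] = [slot] * n
    return tags


def build_model_input(tokens: List[str], tags: List[str], sep: str = "[SEP]") -> List[str]:
    """Concatenate the utterance and its candidate tags into one sequence.

    Assumption: the candidates are appended after a separator token; the
    paper may combine them differently.
    """
    return tokens + [sep] + tags


# Hypothetical developer-defined lists and utterance.
slot_lists = {"song": ["Shape of You"], "artist": ["Ed Sheeran", "Ed"]}
tokens = "play shape of you by ed sheeran".split()
tags = extract_candidates(tokens, slot_lists)
model_input = build_model_input(tokens, tags)
```

In this sketch the candidate tags act only as hints: a downstream tagger such as BERT would still make the final slot decision for each token, which is one plausible way a finite list could compensate for scarce training data.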
Anthology ID:
2020.icon-main.44
Volume:
Proceedings of the 17th International Conference on Natural Language Processing (ICON)
Month:
December
Year:
2020
Address:
Indian Institute of Technology Patna, Patna, India
Editors:
Pushpak Bhattacharyya, Dipti Misra Sharma, Rajeev Sangal
Venue:
ICON
Publisher:
NLP Association of India (NLPAI)
Pages:
329–333
URL:
https://aclanthology.org/2020.icon-main.44
Cite (ACL):
Bharatram Natarajan, Dharani Simma, Chirag Singh, Anish Nediyanchath, and Sreoshi Sengupta. 2020. Semantic Slot Prediction on low corpus data using finite user defined list. In Proceedings of the 17th International Conference on Natural Language Processing (ICON), pages 329–333, Indian Institute of Technology Patna, Patna, India. NLP Association of India (NLPAI).
Cite (Informal):
Semantic Slot Prediction on low corpus data using finite user defined list (Natarajan et al., ICON 2020)
PDF:
https://aclanthology.org/2020.icon-main.44.pdf