Automatic Assessment Of Spoken English Proficiency Based on Multimodal and Multitask Transformers

Kamel Nebhi, György Szaszák


Abstract
This paper describes technology developed to automatically grade students' spontaneous spoken English proficiency against Common European Framework of Reference for Languages (CEFR) levels. Our automated assessment system covers two tasks: elicited imitation and spontaneous speech assessment. Spontaneous speech assessment is a challenging task that requires evaluating various aspects of speech quality, content, and coherence. We propose a multimodal and multitask transformer model that leverages both audio and text features to perform three tasks: scoring, coherence modeling, and prompt relevancy scoring. The model fuses multiple features and applies attention over multiple modalities to capture the interactions between the audio and text modalities and to learn from different sources of information.
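The abstract gives no architectural detail, so the following is only a toy sketch of the general idea it names: two modalities attend to each other, the results are fused, and three task heads read the fused vector. Everything here is an assumption made for illustration, not the authors' model: the function names (`cross_attend`, `forward`, `make_heads`), the use of unprojected scaled dot-product attention, mean pooling, fusion by concatenation, equal feature dimensions for text and audio, and the random untrained weights.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attend(queries, keys_values):
    # Scaled dot-product attention without learned projections (an assumption):
    # each query vector becomes a weighted average of the other modality's vectors.
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys_values])
        out.append([sum(w * v[i] for w, v in zip(weights, keys_values))
                    for i in range(len(q))])
    return out

def mean_pool(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_heads(dim, rng):
    # One hypothetical linear head (weights, bias) per task named in the abstract.
    tasks = ("score", "coherence", "prompt_relevance")
    return {t: ([rng.uniform(-0.1, 0.1) for _ in range(2 * dim)], 0.0) for t in tasks}

def forward(text_feats, audio_feats, heads):
    text_ctx = cross_attend(text_feats, audio_feats)    # text attends to audio
    audio_ctx = cross_attend(audio_feats, text_feats)   # audio attends to text
    fused = mean_pool(text_ctx) + mean_pool(audio_ctx)  # list concatenation
    def head(name):
        weights, bias = heads[name]
        return dot(fused, weights) + bias
    return {
        "score": head("score"),                                # unbounded regression output
        "coherence": sigmoid(head("coherence")),               # squashed to (0, 1)
        "prompt_relevance": sigmoid(head("prompt_relevance")), # squashed to (0, 1)
    }

# Random stand-ins for per-token text features and per-frame audio features.
rng = random.Random(0)
dim = 4
text = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(5)]
audio = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(8)]
outputs = forward(text, audio, make_heads(dim, rng))
print(sorted(outputs))
```

In a multitask setup of this kind, the heads share the fused representation, so training signals from coherence and prompt relevance can shape the features used for proficiency scoring. How the paper weights or trains these tasks is not stated in the abstract.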
Anthology ID:
2023.ranlp-1.83
Volume:
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
769–776
URL:
https://aclanthology.org/2023.ranlp-1.83
Cite (ACL):
Kamel Nebhi and György Szaszák. 2023. Automatic Assessment Of Spoken English Proficiency Based on Multimodal and Multitask Transformers. In Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, pages 769–776, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Automatic Assessment Of Spoken English Proficiency Based on Multimodal and Multitask Transformers (Nebhi & Szaszák, RANLP 2023)
PDF:
https://aclanthology.org/2023.ranlp-1.83.pdf