Nested Noun Phrase Identification Using BERT

Shweta Misra, Johan Boye


Abstract
For several NLP tasks, an important substep is the identification of noun phrases in running text. This has typically been done by “chunking” – a way of finding minimal noun phrases by token classification. However, chunking-like methods do not represent the fact that noun phrases can be nested. This paper presents a novel method of finding all noun phrases in a sentence, nested to an arbitrary depth, using the BERT model for token classification. We show that our proposed method achieves very good results for both Swedish and English.
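To illustrate the limitation the abstract describes, here is a minimal sketch of conventional flat chunking, assuming the common BIO tagging scheme with `B-NP`, `I-NP` and `O` labels. It is not the authors' method; the helper `bio_to_spans`, the example sentence, and the nested annotation `nested_gold` are hypothetical and serve only to show that one tag per token cannot encode a noun phrase that contains other noun phrases.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode flat BIO chunk tags into noun-phrase spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B-NP":
            # A new chunk begins; close any chunk that is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "I-NP":
            # Tolerate an I- tag that has no preceding B- tag.
            if start is None:
                start = i
        else:
            # "O" closes an open chunk.
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


# Hypothetical example sentence with one BIO tag per token.
tokens = ["the", "cat", "on", "the", "mat", "sleeps"]
tags = ["B-NP", "I-NP", "O", "B-NP", "I-NP", "O"]

flat = bio_to_spans(tags)

# Hypothetical nested annotation: "the cat on the mat" is itself a noun
# phrase that contains "the cat" and "the mat".
nested_gold: List[Span] = [(0, 5), (0, 2), (3, 5)]

# Spans that a single flat tag sequence cannot express.
missing = [s for s in nested_gold if s not in flat]

print(flat)
print(missing)
```

The flat decoding recovers only the two minimal phrases; the enclosing span `(0, 5)` has no representation in a single BIO sequence, which is the gap a nested approach has to close.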
Anthology ID:
2024.lrec-main.1062
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
12138–12143
URL:
https://aclanthology.org/2024.lrec-main.1062
Cite (ACL):
Shweta Misra and Johan Boye. 2024. Nested Noun Phrase Identification Using BERT. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 12138–12143, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Nested Noun Phrase Identification Using BERT (Misra & Boye, LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.1062.pdf