Daniel Van Strien
Also published as: Daniel van Strien
2024
Proceedings of the First Workshop on Advancing Natural Language Processing for Wikipedia
Lucie Lucie-Aimée | Angela Fan | Tajuddeen Gwadabe | Isaac Johnson | Fabio Petroni | Daniel van Strien
2022
Entities, Dates, and Languages: Zero-Shot on Historical Texts with T0
Francesco De Toni | Christopher Akiki | Javier De La Rosa | Clémentine Fourrier | Enrique Manjavacas | Stefan Schweter | Daniel Van Strien
Proceedings of BigScience Episode #5 -- Workshop on Challenges & Perspectives in Creating Large Language Models
In this work, we explore whether the recently demonstrated zero-shot abilities of the T0 model extend to Named Entity Recognition for out-of-distribution languages and time periods. Using a historical newspaper corpus in three languages as a test bed, we use prompts to extract possible named entities. Our results show that a naive approach to prompt-based zero-shot multilingual Named Entity Recognition is error-prone, but they also highlight the potential of such an approach for historical languages lacking labeled datasets. Moreover, we find that T0-like models can be probed to predict the publication date and language of a document, which could be very relevant for the study of historical texts.