Knowledge Discovery in COVID-19 Research Literature

Alejandro Piad-Morffis, Suilan Estevez-Velarde, Ernesto Luis Estevanell-Valladares, Yoan Gutiérrez, Andrés Montoyo, Rafael Muñoz, Yudivián Almeida-Cruz


Abstract
This paper presents the preliminary results of an ongoing project that analyzes the growing body of scientific research published around the COVID-19 pandemic. In this research, a general-purpose semantic model is used to double-annotate a batch of 500 sentences that the researchers manually selected from the CORD-19 corpus. Afterwards, a baseline text-mining pipeline is designed and evaluated on a large batch of 100,959 sentences. We present a qualitative analysis of the most interesting facts automatically extracted and highlight possible future lines of development. The preliminary results show that general-purpose semantic models are a useful tool for discovering fine-grained knowledge in large corpora of scientific documents.
Anthology ID:
2020.nlpcovid19-2.22
Volume:
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020
Month:
December
Year:
2020
Address:
Online
Editors:
Karin Verspoor, Kevin Bretonnel Cohen, Michael Conway, Berry de Bruijn, Mark Dredze, Rada Mihalcea, Byron Wallace
Venue:
NLP-COVID19
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/2020.nlpcovid19-2.22
DOI:
10.18653/v1/2020.nlpcovid19-2.22
Cite (ACL):
Alejandro Piad-Morffis, Suilan Estevez-Velarde, Ernesto Luis Estevanell-Valladares, Yoan Gutiérrez, Andrés Montoyo, Rafael Muñoz, and Yudivián Almeida-Cruz. 2020. Knowledge Discovery in COVID-19 Research Literature. In Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020, Online. Association for Computational Linguistics.
Cite (Informal):
Knowledge Discovery in COVID-19 Research Literature (Piad-Morffis et al., NLP-COVID19 2020)
PDF:
https://aclanthology.org/2020.nlpcovid19-2.22.pdf