Towards Classification of Legal Pharmaceutical Text using GAN-BERT

Tapan Auti, Rajdeep Sarkar, Bernardo Stearns, Atul Kr. Ojha, Arindam Paul, Michaela Comerford, Jay Megaro, John Mariano, Vall Herard, John P. McCrae


Abstract
Pharmaceutical text classification is an important area of research for commercial and research institutions working in the pharmaceutical domain. Addressing this task is challenging due to the need for expert-verified labelled data, which can be expensive and time-consuming to obtain. Towards this end, we leverage predictive coding methods for the task, as they have been shown to generalise well for sentence classification. Specifically, we utilise the GAN-BERT architecture to classify pharmaceutical texts. To capture the domain specificity, we propose to utilise the BioBERT model as our BERT model in the GAN-BERT framework. We conduct an extensive evaluation to show the efficacy of our approach over baselines on multiple metrics.
Anthology ID:
2022.csrnlp-1.8
Volume:
Proceedings of the First Computing Social Responsibility Workshop within the 13th Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Mingyu Wan, Chu-Ren Huang
Venue:
CSRNLP
Publisher:
European Language Resources Association
Pages:
52–57
URL:
https://aclanthology.org/2022.csrnlp-1.8
Cite (ACL):
Tapan Auti, Rajdeep Sarkar, Bernardo Stearns, Atul Kr. Ojha, Arindam Paul, Michaela Comerford, Jay Megaro, John Mariano, Vall Herard, and John P. McCrae. 2022. Towards Classification of Legal Pharmaceutical Text using GAN-BERT. In Proceedings of the First Computing Social Responsibility Workshop within the 13th Language Resources and Evaluation Conference, pages 52–57, Marseille, France. European Language Resources Association.
Cite (Informal):
Towards Classification of Legal Pharmaceutical Text using GAN-BERT (Auti et al., CSRNLP 2022)
PDF:
https://aclanthology.org/2022.csrnlp-1.8.pdf