Academics Can Contribute to Domain-Specialized Language Models

Mark Dredze, Genta Indra Winata, Prabhanjan Kambadur, Shijie Wu, Ozan Irsoy, Steven Lu, Vadim Dabravolski, David S Rosenberg, Sebastian Gehrmann


Abstract
Commercially available models dominate academic leaderboards. While impressive, this has concentrated research on creating and adapting general-purpose models to improve NLP leaderboard standings for large language models. However, leaderboards collect many individual tasks and general-purpose models often underperform in specialized domains; domain-specific or adapted models yield superior results. This focus on large general-purpose models excludes many academics and draws attention away from areas where they can make important contributions. We advocate for a renewed focus on developing and evaluating domain- and task-specific models, and highlight the unique role of academics in this endeavor.
Anthology ID:
2024.emnlp-main.293
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
5100–5110
URL:
https://aclanthology.org/2024.emnlp-main.293/
DOI:
10.18653/v1/2024.emnlp-main.293
Cite (ACL):
Mark Dredze, Genta Indra Winata, Prabhanjan Kambadur, Shijie Wu, Ozan Irsoy, Steven Lu, Vadim Dabravolski, David S Rosenberg, and Sebastian Gehrmann. 2024. Academics Can Contribute to Domain-Specialized Language Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 5100–5110, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Academics Can Contribute to Domain-Specialized Language Models (Dredze et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.293.pdf