An Approach to Co-reference Resolution and Formula Grounding for Mathematical Identifiers Using Large Language Models

Aamin Dev, Takuto Asakura, Rune Sætre


Abstract
This paper outlines an automated approach to annotating mathematical identifiers in scientific papers, a process that has historically been laborious and costly. We employ state-of-the-art LLMs, including GPT-3.5 and GPT-4, as well as open-source alternatives, to generate a dictionary for annotating mathematical identifiers: each identifier is linked to its possible descriptions, and these descriptions are then assigned to the respective identifier instances based on context. Evaluation metrics include the CoNLL score for co-reference cluster quality and the semantic correctness of the annotations.
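The two-step idea in the abstract (build a dictionary of candidate descriptions, then assign one description to each identifier instance) can be illustrated with a minimal Python sketch. Everything here is hypothetical and is not taken from the paper: the sample dictionary, the annotation tuples, and the helper names `build_clusters` and `conll_score` are assumptions made for illustration. The sketch assumes that instances sharing an identifier and a description form one co-reference cluster, and that the CoNLL score follows its usual definition as the unweighted mean of the MUC, B-cubed, and CEAF-e F1 values.

```python
from collections import defaultdict

# Hypothetical dictionary: each identifier maps to its candidate descriptions.
dictionary = {
    "x": ["input vector", "horizontal coordinate"],
    "n": ["number of samples"],
}

# Hypothetical annotations, one per identifier instance:
# (instance id, identifier, index of the description chosen from the dictionary).
annotations = [
    (0, "x", 0),
    (1, "n", 0),
    (2, "x", 1),
    (3, "x", 0),
]


def build_clusters(annotations):
    """Group instances that share both an identifier and a description.

    Assumption: each such group is treated as one co-reference cluster.
    """
    clusters = defaultdict(list)
    for instance_id, identifier, description_index in annotations:
        clusters[(identifier, description_index)].append(instance_id)
    return dict(clusters)


def conll_score(muc_f1, b_cubed_f1, ceaf_e_f1):
    """Unweighted mean of the MUC, B-cubed, and CEAF-e F1 values."""
    return (muc_f1 + b_cubed_f1 + ceaf_e_f1) / 3.0


clusters = build_clusters(annotations)
for (identifier, description_index), members in clusters.items():
    description = dictionary[identifier][description_index]
    print(f"{identifier} ({description}): instances {members}")
```

In this toy input, the two readings of `x` end up in separate clusters, which is the distinction a co-reference metric is meant to reward. The three F1 inputs to `conll_score` would have to come from a separate co-reference scorer; computing them is outside the scope of this sketch.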
Anthology ID:
2024.mathnlp-1.1
Volume:
Proceedings of the 2nd Workshop on Mathematical Natural Language Processing @ LREC-COLING 2024
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Marco Valentino, Deborah Ferreira, Mokanarangan Thayaparan, Andre Freitas
Venues:
MathNLP | WS
Publisher:
ELRA and ICCL
Pages:
1–10
URL:
https://aclanthology.org/2024.mathnlp-1.1
Cite (ACL):
Aamin Dev, Takuto Asakura, and Rune Sætre. 2024. An Approach to Co-reference Resolution and Formula Grounding for Mathematical Identifiers Using Large Language Models. In Proceedings of the 2nd Workshop on Mathematical Natural Language Processing @ LREC-COLING 2024, pages 1–10, Torino, Italia. ELRA and ICCL.
Cite (Informal):
An Approach to Co-reference Resolution and Formula Grounding for Mathematical Identifiers Using Large Language Models (Dev et al., MathNLP-WS 2024)
PDF:
https://aclanthology.org/2024.mathnlp-1.1.pdf