How Proficient Are Large Language Models in Formal Languages? An In-Depth Insight for Knowledge Base Question Answering

Jinxin Liu, Shulin Cao, Jiaxin Shi, Tingjian Zhang, Lunyiu Nie, Linmei Hu, Lei Hou, Juanzi Li


Abstract
Knowledge Base Question Answering (KBQA) aims to answer natural language questions based on facts in knowledge bases. A typical approach to KBQA is semantic parsing, which translates a question into an executable logical form in a formal language. Recent works leverage the capabilities of large language models (LLMs) for logical form generation to improve performance. However, although it has been validated that LLMs can solve some KBQA problems, there has been little discussion of how LLMs' proficiency differs across the formal languages used in semantic parsing. In this work, we propose to evaluate LLMs' ability to understand and generate differently structured logical forms by examining the inter-conversion of natural and formal language through in-context learning. Extensive experiments with models of different sizes show that state-of-the-art LLMs can understand formal languages as well as humans can, but generating correct logical forms given only a few examples remains a challenge. Most importantly, our results indicate that LLMs are considerably sensitive to the choice of formal language: in general, a formal language with a lower formalization level, i.e., one closer to natural language, is friendlier to LLMs. Code and data can be found at https://github.com/Matthewlliu/structure_probe.
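To make the probing setup concrete, below is a minimal, hypothetical sketch (not the authors' released code) of the kind of few-shot prompt used to test natural-to-formal conversion via in-context learning, here with SPARQL as the target formal language; the demonstration questions and queries are illustrative only.

```python
# Hypothetical sketch of a few-shot (in-context learning) prompt for
# translating natural-language questions into SPARQL logical forms.
# The demonstrations below are illustrative, not from the paper's data.
DEMONSTRATIONS = [
    ("Who directed Titanic?",
     'SELECT ?x WHERE { ?film rdfs:label "Titanic"@en . ?film :director ?x . }'),
    ("When was Marie Curie born?",
     'SELECT ?x WHERE { ?p rdfs:label "Marie Curie"@en . ?p :dateOfBirth ?x . }'),
]

def build_prompt(question: str) -> str:
    """Assemble a prompt that an LLM completes with a SPARQL query."""
    parts = ["Translate each question into an executable SPARQL query."]
    for q, lf in DEMONSTRATIONS:
        parts.append(f"Question: {q}\nSPARQL: {lf}")
    parts.append(f"Question: {question}\nSPARQL:")
    return "\n\n".join(parts)

if __name__ == "__main__":
    # The resulting string is sent to an LLM; its completion is parsed
    # as the predicted logical form and executed against the KB.
    print(build_prompt("Which country has the largest population?"))
```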
Anthology ID: 2024.findings-acl.45
Volume: Findings of the Association for Computational Linguistics: ACL 2024
Month: August
Year: 2024
Address: Bangkok, Thailand and virtual meeting
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 792–815
URL: https://aclanthology.org/2024.findings-acl.45
DOI: 10.18653/v1/2024.findings-acl.45
Cite (ACL):
Jinxin Liu, Shulin Cao, Jiaxin Shi, Tingjian Zhang, Lunyiu Nie, Linmei Hu, Lei Hou, and Juanzi Li. 2024. How Proficient Are Large Language Models in Formal Languages? An In-Depth Insight for Knowledge Base Question Answering. In Findings of the Association for Computational Linguistics: ACL 2024, pages 792–815, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
How Proficient Are Large Language Models in Formal Languages? An In-Depth Insight for Knowledge Base Question Answering (Liu et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-acl.45.pdf