2024
BlendSQL: A Scalable Dialect for Unifying Hybrid Question Answering in Relational Algebra
Parker Glenn | Parag Dakle | Liang Wang | Preethi Raghavan
Findings of the Association for Computational Linguistics ACL 2024
Many existing end-to-end systems for hybrid question answering tasks can often be boiled down to a “prompt-and-pray” paradigm, where the user has limited control and insight into the intermediate reasoning steps used to achieve the final result. Additionally, due to the context size limitation of many transformer-based LLMs, it is often not reasonable to expect that the full structured and unstructured context will fit into a given prompt in a zero-shot setting, let alone a few-shot setting. We introduce BlendSQL, a superset of SQLite that acts as a unified dialect for orchestrating reasoning across both unstructured and structured data. For hybrid question answering tasks involving multi-hop reasoning, we encode the full decomposed reasoning roadmap into a single interpretable BlendSQL query. Notably, we show that BlendSQL can scale to massive datasets and improve the performance of end-to-end systems while using 35% fewer tokens. Our code is available and installable as a package at https://github.com/parkervg/blendsql.
2023
Correcting Semantic Parses with Natural Language through Dynamic Schema Encoding
Parker Glenn | Parag Pravin Dakle | Preethi Raghavan
Proceedings of the 5th Workshop on NLP for Conversational AI (NLP4ConvAI 2023)
In addressing the task of converting natural language to SQL queries, there are several semantic and syntactic challenges. It becomes increasingly important to understand and remedy the points of failure as the performance of semantic parsing systems improves. We explore semantic parse correction with natural language feedback, proposing a new solution built on the success of autoregressive decoders in text-to-SQL tasks. By separating the semantic and syntactic difficulties of the task, we show that the accuracy of text-to-SQL parsers can be boosted by up to 26% with only one turn of correction with natural language. Additionally, we show that a T5-base model is capable of correcting the errors of a T5-large model in a zero-shot, cross-parser setting.
Jetsons at the FinNLP-2023: Using Synthetic Data and Transfer Learning for Multilingual ESG Issue Classification
Parker Glenn | Alolika Gon | Nikhil Kohli | Sihan Zha | Parag Pravin Dakle | Preethi Raghavan
Proceedings of the Fifth Workshop on Financial Technology and Natural Language Processing and the Second Multimodal AI For Financial Forecasting
2022
The Viability of Best-worst Scaling and Categorical Data Label Annotation Tasks in Detecting Implicit Bias
Parker Glenn | Cassandra L. Jacobs | Marvin Thielk | Yi Chu
Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022
Annotating workplace bias in text is a noisy and subjective task. In encoding the inherently continuous nature of bias, aggregated binary classifications do not suffice. Best-worst scaling (BWS) offers a framework to obtain real-valued scores through a series of comparative evaluations, but it is often impractical to deploy to traditional annotation pipelines within industry. We present analyses of a small-scale bias dataset, jointly annotated with categorical annotations and BWS annotations. We show that there is a strong correlation between observed agreement and BWS score (Spearman’s r=0.72). We identify several shortcomings of BWS relative to traditional categorical annotation: (1) When compared to categorical annotation, we estimate BWS takes approximately 4.5x longer to complete; (2) BWS does not scale well to large annotation tasks with sparse target phenomena; (3) The high correlation between BWS and the traditional task shows that the benefits of BWS can be recovered from a simple categorically annotated, non-aggregated dataset.