2024
Decoding at the Speed of Thought: Harnessing Parallel Decoding of Lexical Units for LLMs
Chenxi Sun | Hongzhi Zhang | Zijia Lin | Jingyuan Zhang | Fuzheng Zhang | Zhongyuan Wang | Bin Chen | Chengru Song | Di Zhang | Kun Gai | Deyi Xiong
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Large language models have demonstrated exceptional capability in natural language understanding and generation. However, their generation speed is limited by the inherently sequential nature of their decoding process, posing challenges for real-time applications. This paper introduces Lexical Unit Decoding (LUD), a novel decoding methodology implemented in a data-driven manner, accelerating the decoding process without sacrificing output quality. The core of our approach is the observation that a pre-trained language model can confidently predict multiple contiguous tokens, forming the basis of a lexical unit, within which these contiguous tokens can be decoded in parallel. Extensive experiments validate that our method substantially reduces decoding time while maintaining generation quality, i.e., a 33% speed-up on natural language generation with no quality loss and a 30% speed-up on code generation with a negligible quality loss of 3%. Distinctively, LUD requires no auxiliary models and no changes to existing architectures. It can also be integrated with other decoding acceleration methods, achieving an even more pronounced boost in inference efficiency. We posit that the foundational principles of LUD could define a new decoding paradigm for future language models, enhancing their applicability for a broader spectrum of applications. All code is publicly available at https://github.com/tjunlp-lab/Lexical-Unit-Decoding-LUD-.
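A minimal sketch of the confidence-gated acceptance idea the abstract describes: if the model proposes several contiguous tokens with high confidence, commit to all of them in one step; otherwise fall back to ordinary one-token decoding. This is not the authors' implementation; the `propose` callable, the 0.9 threshold, and the toy token ids are illustrative assumptions.

```python
from typing import Callable, List, Tuple

def lexical_unit_step(
    prefix: List[int],
    propose: Callable[[List[int]], List[Tuple[int, float]]],
    threshold: float = 0.9,
) -> List[int]:
    """Return the tokens accepted in one decoding step.

    `propose` is a hypothetical callable returning (token, probability)
    pairs for the next few positions given the current prefix.
    """
    proposals = propose(prefix)
    accepted: List[int] = []
    for token, prob in proposals:
        if prob < threshold and accepted:
            # Stop at the first low-confidence token once something is accepted.
            break
        accepted.append(token)
        if prob < threshold:
            # Always keep at least one token, i.e. ordinary one-token decoding.
            break
    return accepted

# Toy proposer: the first three tokens are confident, so they are emitted
# together as one "lexical unit"; the fourth is left for the next step.
toy = lambda prefix: [(11, 0.97), (12, 0.95), (13, 0.93), (14, 0.41)]
print(lexical_unit_step([1, 2, 3], toy))  # -> [11, 12, 13]
```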
Mitigating Linguistic Artifacts in Emotion Recognition for Conversations from TV Scripts to Daily Conversations
Donovan Ong | Shuo Sun | Jian Su | Bin Chen
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Emotion Recognition in Conversations (ERC) is a well-studied task with numerous potential real-world applications. However, existing ERC models trained on the MELD dataset, which is derived from TV series, struggle when applied to daily conversation datasets. A closer examination of the datasets unveils the prevalence of linguistic artifacts such as repetitions and interjections in TV scripts, which ERC models may exploit when making predictions. To address this issue, we explore two techniques aimed at reducing the reliance of ERC models on these artifacts: 1) using contrastive learning to prioritize emotional features over dataset-specific linguistic style and 2) refining emotion predictions with a pseudo-emotion intensity score. Our experimental results show that reducing reliance on the linguistic style found in TV transcripts can enhance the models’ robustness and accuracy in diverse conversational contexts.
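A rough sketch of the first technique, a supervised contrastive objective that pulls utterance embeddings with the same emotion label together regardless of surface linguistic style. This is an assumption-laden illustration, not the paper's implementation; the function name, temperature, and toy batch are made up for the example.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.1) -> torch.Tensor:
    """embeddings: (batch, dim) utterance representations; labels: (batch,) emotion ids."""
    z = F.normalize(embeddings, dim=-1)
    sim = z @ z.t() / temperature                       # pairwise similarities
    eye = torch.eye(len(labels), dtype=torch.bool, device=z.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye  # same-emotion pairs
    # Log-softmax over all other examples in the batch.
    log_prob = sim - torch.logsumexp(sim.masked_fill(eye, float("-inf")), dim=1, keepdim=True)
    # Average over positive pairs; anchors without positives contribute zero.
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    loss = -(log_prob * pos_mask.float()).sum(dim=1) / pos_counts
    return loss.mean()

# Toy usage: four utterance embeddings with emotion labels [0, 0, 1, 1].
emb = torch.randn(4, 16)
labels = torch.tensor([0, 0, 1, 1])
print(supervised_contrastive_loss(emb, labels))
```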
2023
An Exploratory Study on Model Compression for Text-to-SQL
Shuo Sun | Yuze Gao | Yuchen Zhang | Jian Su | Bin Chen | Yingzhan Lin | Shuqi Sun
Findings of the Association for Computational Linguistics: ACL 2023
Text-to-SQL translates user queries into SQL statements that can retrieve relevant answers from relational databases. Recent approaches to Text-to-SQL rely on pre-trained language models that are computationally expensive and technically challenging to deploy in real-world applications that require real-time or on-device processing capabilities. In this paper, we perform a focused study on the feasibility of applying recent model compression techniques to sketch-based and sequence-to-sequence Text-to-SQL models. Our results reveal that sketch-based Text-to-SQL models generally have higher inference efficiency and respond better to model compression than sequence-to-sequence models, making them ideal for real-world deployments, especially in use cases with simple SQL statements.
Battle of the Large Language Models: Dolly vs LLaMA vs Vicuna vs Guanaco vs Bard vs ChatGPT - A Text-to-SQL Parsing Comparison
Shuo Sun | Yuchen Zhang | Jiahuan Yan | Yuze Gao | Donovan Ong | Bin Chen | Jian Su
Findings of the Association for Computational Linguistics: EMNLP 2023
The success of ChatGPT has ignited an AI race, with researchers striving to develop new large language models (LLMs) that can match or surpass the language understanding and generation abilities of commercial ones. In recent times, a number of models have emerged, claiming performance near that of GPT-3.5 or GPT-4 through various instruction-tuning methods. As practitioners of Text-to-SQL parsing, we are grateful for their valuable contributions to open-source research. However, it is important to approach these claims with scrutiny and ascertain the actual effectiveness of these models. Therefore, we pit six popular large language models against each other, systematically evaluating their Text-to-SQL parsing capability on nine benchmark datasets with five different prompting strategies, covering both zero-shot and few-shot scenarios. Regrettably, the open-source models fell significantly short of the performance achieved by closed-source models like GPT-3.5, highlighting the need for further work to bridge the performance gap between these models.
2015
Improving Twitter Named Entity Recognition using Word Representations
Zhiqiang Toh | Bin Chen | Jian Su
Proceedings of the Workshop on Noisy User-generated Text
2013
Exploiting Discourse Analysis for Article-Wide Temporal Classification
Jun-Ping Ng | Min-Yen Kan | Ziheng Lin | Wei Feng | Bin Chen | Jian Su | Chew-Lim Tan
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
2011
A Unified Event Coreference Resolution by Integrating Multiple Resolvers
Bin Chen | Jian Su | Sinno Jialin Pan | Chew Lim Tan
Proceedings of 5th International Joint Conference on Natural Language Processing
2010
Resolving Event Noun Phrases to Their Verbal Mentions
Bin Chen | Jian Su | Chew Lim Tan
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
A Twin-Candidate Based Approach for Event Pronoun Resolution using Composite Kernel
Bin Chen | Jian Su | Chew Lim Tan
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)
2008
Other-Anaphora Resolution in Biomedical Texts with Automatically Mined Patterns
Bin Chen | Xiaofeng Yang | Jian Su | Chew Lim Tan
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)