Zhu Liu


2024

Fantastic Semantics and Where to Find Them: Investigating Which Layers of Generative LLMs Reflect Lexical Semantics
Zhu Liu | Cunliang Kong | Ying Liu | Maosong Sun
Findings of the Association for Computational Linguistics: ACL 2024

Large language models have achieved remarkable success in general language understanding tasks. However, as a family of generative methods with the objective of next-token prediction, the semantic evolution with the depth of these models is not fully explored, unlike their predecessors, such as BERT-like architectures. In this paper, we specifically investigate the bottom-up evolution of lexical semantics for a popular LLM, namely Llama2, by probing its hidden states at the end of each layer using a contextualized word identification task. Our experiments show that the representations in lower layers encode lexical semantics, while the higher layers, with weaker semantic induction, are responsible for prediction. This is in contrast to models with discriminative objectives, such as masked language modeling, where the higher layers obtain better lexical semantics. The conclusion is further supported by the monotonic increase in performance via the hidden states for the last meaningless symbols, such as punctuation, in the prompting strategy. Our code is available at https://github.com/RyanLiut/LLM_LexSem.

2023

Ambiguity Meets Uncertainty: Investigating Uncertainty Estimation for Word Sense Disambiguation
Zhu Liu | Ying Liu
Findings of the Association for Computational Linguistics: ACL 2023

Word sense disambiguation (WSD), which aims to determine an appropriate sense for a target word given its context, is crucial for natural language understanding. Existing supervised methods treat WSD as a classification task and have achieved remarkable performance. However, they ignore uncertainty estimation (UE) in the real-world setting, where the data is always noisy and out of distribution. This paper extensively studies UE on the benchmark designed for WSD. Specifically, we first compare four uncertainty scores for a state-of-the-art WSD model and verify that the conventional predictive probabilities obtained at the end of the model are inadequate to quantify uncertainty. Then, we examine the capability of capturing data and model uncertainties by the model with the selected UE score on well-designed test scenarios and discover that the model reflects data uncertainty satisfactorily but underestimates model uncertainty. Furthermore, we explore numerous lexical properties that intrinsically affect data uncertainty and provide a detailed analysis of four critical aspects: the syntactic category, morphology, sense granularity, and semantic relations.

2007

The Multimodal Presentation Dashboard
Michael Johnston | Patrick Ehlen | David Gibbon | Zhu Liu
Proceedings of the Workshop on Bridging the Gap: Academic and Industrial Research in Dialog Technologies

2004

A System for Searching and Browsing Spoken Communications
Lee Begeja | Bernard Renger | Murat Saraclar | David Gibbon | Zhu Liu | Behzad Shahraray
Proceedings of the Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval at HLT-NAACL 2004

Interactive Machine Learning Techniques for Improving SLU Models
Lee Begeja | Bernard Renger | David Gibbon | Zhu Liu | Behzad Shahraray
Proceedings of the HLT-NAACL 2004 Workshop on Spoken Language Understanding for Conversational Systems and Higher Level Linguistic Information for Speech Processing