Lei Sha


2024

pdf bib
Text Attribute Control via Closed-Loop Disentanglement
Lei Sha | Thomas Lukasiewicz
Transactions of the Association for Computational Linguistics, Volume 12

Changing an attribute of a text without changing the content usually requires first disentangling the text into irrelevant attributes and content representations. After that, in the inference phase, the representation of one attribute is tuned to a different value, expecting that the corresponding attribute of the text can also be changed accordingly. The usual way of disentanglement is to add some constraints on the latent space of an encoder-decoder architecture, including adversarial-based constraints and mutual-information-based constraints. However, previous semi-supervised processes of attribute change are usually not enough to guarantee the success of attribute change and content preservation. In this paper, we propose a novel approach to achieve a robust control of attributes while enhancing content preservation. In this approach, we use a semi-supervised contrastive learning method to encourage the disentanglement of attributes in latent spaces. Differently from previous works, we re-disentangle the reconstructed sentence and compare the re-disentangled latent space with the original latent space, which makes a closed-loop disentanglement process. This also helps content preservation. In addition, the contrastive learning method is also able to replace the role of minimizing mutual information and adversarial training in the disentanglement process, which alleviates the computation cost. We conducted experiments on three text datasets, including the Yelp Service review dataset, the Amazon Product review dataset, and the GoEmotions dataset. The experimental results show the effectiveness of our model.

2023

pdf bib
Harnessing the Plug-and-Play Controller by Prompting
Hao Wang | Lei Sha
Proceedings of the Third Workshop on Natural Language Generation, Evaluation, and Metrics (GEM)

Controllable text generation is a growing field within natural language generation (NLG) that focuses on producing text that meets specific constraints in real-world applications. Previous approaches, such as plug-and-play controllers (PPCs), aimed to steer the properties of generated text in a flexible manner. However, these methods often compromised the integrity of the language model’s decoding process, resulting in less smooth text generation.Alternatively, other techniques utilized multiple attribute prompts to align the generated text with desired attributes, but this approach required prompt design for each attribute and was dependent on the size of the language model. This paper introduces a novel method for flexible attribute control in text generation using pre-trained language models (PLMs). The proposed approach aims to enhance the fluency of generated text by guiding the generation process with PPCs. The key idea is to dynamically adjust the distribution of generated text by modifying prompts, effectively constraining the output space of the language model and influencing the desired attribute. To enable smooth cooperation between the PLM and the PPC, our work innovativel proposes a new model fine-tuning method: Reinforcement Learning with Dynamic Adjust Feedback (RLDAF).This fine-tuning process adapts a small subset of the language model’s parameters based on the generating actions taken during the PPC control process. The resulting harmonious collaboration between the PLM and PPC leads to improved smoothness in text generation during inference. Extensive experiments were conducted on the SST2 dataset, and the proposed method outperformed previous approaches in various evaluation metrics, including text fluency and attribute consistency.

2022

pdf bib
RecInDial: A Unified Framework for Conversational Recommendation with Pretrained Language Models
Lingzhi Wang | Huang Hu | Lei Sha | Can Xu | Daxin Jiang | Kam-Fai Wong
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Conversational Recommender System (CRS), which aims to recommend high-quality items to users through interactive conversations, has gained great research interest recently. A CRS is usually composed of a recommendation module and a generation module. In the previous work, these two modules are loosely connected in the model training and are shallowly integrated during inference, where a simple switching or copy mechanism is adopted to incorporate recommended items into generated responses. Moreover, the current end-to-end neural models trained on small crowd-sourcing datasets (e.g., 10K dialogs in the ReDial dataset) tend to overfit and have poor chit-chat ability. In this work, we propose a novel unified framework that integrates recommendation into the dialog (RecInDial) generation by introducing a vocabulary pointer. To tackle the low-resource issue in CRS, we finetune the large-scale pretrained language models to generate fluent and diverse responses, and introduce a knowledge-aware bias learned from an entity-oriented knowledge graph to enhance the recommendation performance. Furthermore, we propose to evaluate the CRS models in an end-to-end manner, which can reflect the overall performance of the entire system rather than the performance of individual modules, compared to the separate evaluations of the two modules used in previous work. Experiments on the benchmark dataset ReDial show our RecInDial model significantly surpasses the state-of-the-art methods. More extensive analyses show the effectiveness of our model.

2021

pdf bib
Controlling Text Edition by Changing Answers of Specific Questions
Lei Sha | Patrick Hohenecker | Thomas Lukasiewicz
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

pdf bib
Gradient-guided Unsupervised Lexically Constrained Text Generation
Lei Sha
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Lexically constrained generation requires the target sentence to satisfy some lexical constraints, such as containing some specific words or being the paraphrase to a given sentence, which is very important in many real-world natural language generation applications. Previous works usually apply beam-search-based methods or stochastic searching methods to lexically-constrained generation. However, when the search space is too large, beam-search-based methods always fail to find the constrained optimal solution. At the same time, stochastic search methods always cost too many steps to find the correct optimization direction. In this paper, we propose a novel method G2LC to solve the lexically-constrained generation as an unsupervised gradient-guided optimization problem. We propose a differentiable objective function and use the gradient to help determine which position in the sequence should be changed (deleted or inserted/replaced by another word). The word updating process of the inserted/replaced word also benefits from the guidance of gradient. Besides, our method is free of parallel data training, which is flexible to be used in the inference stage of any pre-trained generation model. We apply G2LC to two generation tasks: keyword-to-sentence generation and unsupervised paraphrase generation. The experiment results show that our method achieves state-of-the-art compared to previous lexically-constrained methods.

2018

pdf bib
Auto-Dialabel: Labeling Dialogue Data with Unsupervised Learning
Chen Shi | Qi Chen | Lei Sha | Sujian Li | Xu Sun | Houfeng Wang | Lintao Zhang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

The lack of labeled data is one of the main challenges when building a task-oriented dialogue system. Existing dialogue datasets usually rely on human labeling, which is expensive, limited in size, and in low coverage. In this paper, we instead propose our framework auto-dialabel to automatically cluster the dialogue intents and slots. In this framework, we collect a set of context features, leverage an autoencoder for feature assembly, and adapt a dynamic hierarchical clustering method for intent and slot labeling. Experimental results show that our framework can promote human labeling cost to a great extent, achieve good intent clustering accuracy (84.1%), and provide reasonable and instructive slot labeling results.

2017

pdf bib
Syntax Aware LSTM model for Semantic Role Labeling
Feng Qian | Lei Sha | Baobao Chang | Lu-chen Liu | Ming Zhang
Proceedings of the 2nd Workshop on Structured Prediction for Natural Language Processing

In Semantic Role Labeling (SRL) task, the tree structured dependency relation is rich in syntax information, but it is not well handled by existing models. In this paper, we propose Syntax Aware Long Short Time Memory (SA-LSTM). The structure of SA-LSTM changes according to dependency structure of each sentence, so that SA-LSTM can model the whole tree structure of dependency relation in an architecture engineering way. Experiments demonstrate that on Chinese Proposition Bank (CPB) 1.0, SA-LSTM improves F1 by 2.06% than ordinary bi-LSTM with feature engineered dependency relation information, and gives state-of-the-art F1 of 79.92%. On English CoNLL 2005 dataset, SA-LSTM brings improvement (2.1%) to bi-LSTM model and also brings slight improvement (0.3%) when added to the state-of-the-art model.

pdf bib
A Progressive Learning Approach to Chinese SRL Using Heterogeneous Data
Qiaolin Xia | Lei Sha | Baobao Chang | Zhifang Sui
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Previous studies on Chinese semantic role labeling (SRL) have concentrated on a single semantically annotated corpus. But the training data of single corpus is often limited. Whereas the other existing semantically annotated corpora for Chinese SRL are scattered across different annotation frameworks. But still, Data sparsity remains a bottleneck. This situation calls for larger training datasets, or effective approaches which can take advantage of highly heterogeneous data. In this paper, we focus mainly on the latter, that is, to improve Chinese SRL by using heterogeneous corpora together. We propose a novel progressive learning model which augments the Progressive Neural Network with Gated Recurrent Adapters. The model can accommodate heterogeneous inputs and effectively transfer knowledge between them. We also release a new corpus, Chinese SemBank, for Chinese SRL. Experiments on CPB 1.0 show that our model outperforms state-of-the-art methods.

2016

pdf bib
Capturing Argument Relationship for Chinese Semantic Role Labeling
Lei Sha | Sujian Li | Baobao Chang | Zhifang Sui | Tingsong Jiang
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Encoding Temporal Information for Time-Aware Link Prediction
Tingsong Jiang | Tianyu Liu | Tao Ge | Lei Sha | Sujian Li | Baobao Chang | Zhifang Sui
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Joint Learning Templates and Slots for Event Schema Induction
Lei Sha | Sujian Li | Baobao Chang | Zhifang Sui
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Towards Time-Aware Knowledge Graph Completion
Tingsong Jiang | Tianyu Liu | Tao Ge | Lei Sha | Baobao Chang | Sujian Li | Zhifang Sui
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Knowledge graph (KG) completion adds new facts to a KG by making inferences from existing facts. Most existing methods ignore the time information and only learn from time-unknown fact triples. In dynamic environments that evolve over time, it is important and challenging for knowledge graph completion models to take into account the temporal aspects of facts. In this paper, we present a novel time-aware knowledge graph completion model that is able to predict links in a KG using both the existing facts and the temporal information of the facts. To incorporate the happening time of facts, we propose a time-aware KG embedding model using temporal order information among facts. To incorporate the valid time of facts, we propose a joint time-aware inference model based on Integer Linear Programming (ILP) using temporal consistencyinformationasconstraints. Wefurtherintegratetwomodelstomakefulluseofglobal temporal information. We empirically evaluate our models on time-aware KG completion task. Experimental results show that our time-aware models achieve the state-of-the-art on temporal facts consistently.

pdf bib
Reading and Thinking: Re-read LSTM Unit for Textual Entailment Recognition
Lei Sha | Baobao Chang | Zhifang Sui | Sujian Li
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Recognizing Textual Entailment (RTE) is a fundamentally important task in natural language processing that has many applications. The recently released Stanford Natural Language Inference (SNLI) corpus has made it possible to develop and evaluate deep neural network methods for the RTE task. Previous neural network based methods usually try to encode the two sentences (premise and hypothesis) and send them together into a multi-layer perceptron to get their entailment type, or use LSTM-RNN to link two sentences together while using attention mechanic to enhance the model’s ability. In this paper, we propose to use the re-read mechanic, which means to read the premise again and again while reading the hypothesis. After read the premise again, the model can get a better understanding of the premise, which can also affect the understanding of the hypothesis. On the contrary, a better understanding of the hypothesis can also affect the understanding of the premise. With the alternative re-read process, the model can “think” of a better decision of entailment type. We designed a new LSTM unit called re-read LSTM (rLSTM) to implement this “thinking” process. Experiments show that we achieve results better than current state-of-the-art equivalents.

pdf bib
RBPB: Regularization-Based Pattern Balancing Method for Event Extraction
Lei Sha | Jing Liu | Chin-Yew Lin | Sujian Li | Baobao Chang | Zhifang Sui
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

pdf bib
Multi-label Text Categorization with Joint Learning Predictions-as-Features Method
Li Li | Houfeng Wang | Xu Sun | Baobao Chang | Shi Zhao | Lei Sha
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Recognizing Textual Entailment Using Probabilistic Inference
Lei Sha | Sujian Li | Baobao Chang | Zhifang Sui | Tingsong Jiang
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing