Fengran Mo


2024

pdf bib
History-Aware Conversational Dense Retrieval
Fengran Mo | Chen Qu | Kelong Mao | Tianyu Zhu | Zhan Su | Kaiyu Huang | Jian-Yun Nie
Findings of the Association for Computational Linguistics ACL 2024

Conversational search facilitates complex information retrieval by enabling multi-turn interactions between users and the system. Supporting such interactions requires a comprehensive understanding of the conversational inputs to formulate a good search query based on historical information. In particular, the search query should include the relevant information from the previous conversation turns.However, current approaches for conversational dense retrieval primarily rely on fine-tuning a pre-trained ad-hoc retriever using the whole conversational search session, which can be lengthy and noisy. Moreover, existing approaches are limited by the amount of manual supervision signals in the existing datasets.To address the aforementioned issues, we propose a **H**istory-**A**ware **Conv**ersational **D**ense **R**etrieval (HAConvDR) system, which incorporates two ideas: context-denoised query reformulation and automatic mining of supervision signals based on the actual impact of historical turns.Experiments on two public conversational search datasets demonstrate the improved history modeling capability of HAConvDR, in particular for long conversations with topic shifts.

pdf bib
DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution
Yulong Mao | Kaiyu Huang | Changhao Guan | Ganglin Bao | Fengran Mo | Jinan Xu
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Fine-tuning large-scale pre-trained models is inherently a resource-intensive task. While it can enhance the capabilities of the model, it also incurs substantial computational costs, posing challenges to the practical application of downstream tasks. Existing parameter-efficient fine-tuning (PEFT) methods such as Low-Rank Adaptation (LoRA) rely on a bypass framework that ignores the differential parameter budget requirements across weight matrices, which may lead to suboptimal fine-tuning outcomes. To address this issue, we introduce the Dynamic Low-Rank Adaptation (DoRA) method. DoRA decomposes high-rank LoRA layers into structured single-rank components, allowing for dynamic pruning of parameter budget based on their importance to specific tasks during training, which makes the most of the limited parameter budget. Experimental results demonstrate that DoRA can achieve competitive performance compared with LoRA and full model fine-tuning, and outperform various strong baselines with the same storage parameter budget. Our code is available at [github](https://github.com/MIkumikumi0116/DoRA)

2023

pdf bib
Search-Oriented Conversational Query Editing
Kelong Mao | Zhicheng Dou | Bang Liu | Hongjin Qian | Fengran Mo | Xiangli Wu | Xiaohua Cheng | Zhao Cao
Findings of the Association for Computational Linguistics: ACL 2023

Conversational query rewriting (CQR) realizes conversational search by reformulating the search dialogue into a standalone rewrite. However, existing CQR models either are not learned toward improving the downstream search performance or inefficiently generate the rewrite token-by-token from scratch while neglecting the fact that the search dialogue often has a large overlap with the rewrite. In this paper, we propose EdiRCS, a new text editing-based CQR model tailored for conversational search. In EdiRCS, most of the rewrite tokens are selected from the dialogue in a non-autoregressive fashion and only a few new tokens are generated to supplement the final rewrite, which makes EdiRCS highly efficient. In particular, the learning of EdiRCS is augmented with two search-oriented objectives, including contrastive ranking augmentation and contextualization knowledge transfer, which effectively improve it to select and generate more useful tokens from the view of retrieval. We show that EdiRCS outperforms state-of-the-art CQR models on three conversational search benchmarks while having low rewriting latency, and is robust to out-of-domain search dialogues and long dialogue contexts.

pdf bib
A Customized Text Sanitization Mechanism with Differential Privacy
Sai Chen | Fengran Mo | Yanhao Wang | Cen Chen | Jian-Yun Nie | Chengyu Wang | Jamie Cui
Findings of the Association for Computational Linguistics: ACL 2023

As privacy issues are receiving increasing attention within the Natural Language Processing (NLP) community, numerous methods have been proposed to sanitize texts subject to differential privacy. However, the state-of-the-art text sanitization mechanisms based on a relaxed notion of metric local differential privacy (MLDP) do not apply to non-metric semantic similarity measures and cannot achieve good privacy-utility trade-offs. To address these limitations, we propose a novel Customized Text sanitization (CusText) mechanism based on the original đťś–-differential privacy (DP) definition, which is compatible with any similarity measure.Moreover, CusText assigns each input token a customized output set to provide more advanced privacy protection at the token level.Extensive experiments on several benchmark datasets show that CusText achieves a better trade-off between privacy and utility than existing mechanisms.The code is available at https://github.com/sai4july/CusText.

pdf bib
MoqaGPT : Zero-Shot Multi-modal Open-domain Question Answering with Large Language Model
Le Zhang | Yihong Wu | Fengran Mo | Jian-Yun Nie | Aishwarya Agrawal
Findings of the Association for Computational Linguistics: EMNLP 2023

Multi-modal open-domain question answering typically requires evidence retrieval from databases across diverse modalities, such as images, tables, passages, etc. Even Large Language Models (LLMs) like GPT-4 fall short in this task. To enable LLMs to tackle the task in a zero-shot manner, we introduce MoqaGPT, a straightforward and flexible framework. Using a divide-and-conquer strategy that bypasses intricate multi-modality ranking, our framework can accommodate new modalities and seamlessly transition to new models for the task. Built upon LLMs, MoqaGPT retrieves and extracts answers from each modality separately, then fuses this multi-modal information using LLMs to produce a final answer. Our methodology boosts performance on the MMCoQA dataset, improving F1 by +37.91 points and EM by +34.07 points over the supervised baseline. On the MultiModalQA dataset, MoqaGPT surpasses the zero-shot baseline, improving F1 by 9.5 points and EM by 10.1 points, and significantly closes the gap with supervised methods. Our codebase is available at https://github.com/lezhang7/MOQAGPT.

pdf bib
Large Language Models Know Your Contextual Search Intent: A Prompting Framework for Conversational Search
Kelong Mao | Zhicheng Dou | Fengran Mo | Jiewen Hou | Haonan Chen | Hongjin Qian
Findings of the Association for Computational Linguistics: EMNLP 2023

Precisely understanding users’ contextual search intent has been an important challenge for conversational search. As conversational search sessions are much more diverse and long-tailed, existing methods trained on limited data still show unsatisfactory effectiveness and robustness to handle real conversational search scenarios. Recently, large language models (LLMs) have demonstrated amazing capabilities for text generation and conversation understanding. In this work, we present a simple yet effective prompting framework, called LLM4CS, to leverage LLMs as a text-based search intent interpreter to help conversational search. Under this framework, we explore three prompting methods to generate multiple query rewrites and hypothetical responses, and propose to aggregate them into an integrated representation that can robustly represent the user’s real contextual search intent. Extensive automatic evaluations and human evaluations on three widely used conversational search benchmarks, including CAsT-19, CAsT-20, and CAsT-21, demonstrate the remarkable performance of our simple LLM4CS framework compared with existing methods and even using human rewrites. Our findings provide important evidence to better understand and leverage LLMs for conversational search.

pdf bib
ConvGQR: Generative Query Reformulation for Conversational Search
Fengran Mo | Kelong Mao | Yutao Zhu | Yihong Wu | Kaiyu Huang | Jian-Yun Nie
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In conversational search, the user’s real search intent for the current conversation turn is dependent on the previous conversation history. It is challenging to determine a good search query from the whole conversation context. To avoid the expensive re-training of the query encoder, most existing methods try to learn a rewriting model to de-contextualize the current query by mimicking the manual query rewriting. However, manually rewritten queries are not always the best search queries. Thus, training a rewriting model on them would lead to sub-optimal queries. Another useful information to enhance the search query is the potential answer to the question. In this paper, we propose ConvGQR, a new framework to reformulate conversational queries based on generative pre-trained language models (PLMs), one for query rewriting and another for generating potential answers. By combining both, ConvGQR can produce better search queries. In addition, to relate query reformulation to the retrieval task, we propose a knowledge infusion mechanism to optimize both query reformulation and retrieval. Extensive experiments on four conversational search datasets demonstrate the effectiveness of ConvGQR.

2022

pdf bib
ConvTrans: Transforming Web Search Sessions for Conversational Dense Retrieval
Kelong Mao | Zhicheng Dou | Hongjin Qian | Fengran Mo | Xiaohua Cheng | Zhao Cao
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Conversational search provides users with a natural and convenient new search experience. Recently, conversational dense retrieval has shown to be a promising technique for realizing conversational search. However, as conversational search systems have not been widely deployed, it is hard to get large-scale real conversational search sessions and relevance labels to support the training of conversational dense retrieval. To tackle this data scarcity problem, previous methods focus on developing better few-shot learning approaches or generating pseudo relevance labels, but the data they use for training still heavily rely on manual generation.In this paper, we present ConvTrans, a data augmentation method that can automatically transform easily-accessible web search sessions into conversational search sessions to fundamentally alleviate the data scarcity problem for conversational dense retrieval. ConvTrans eliminates the gaps between these two types of sessions in terms of session quality and query form to achieve effective session transformation. Extensive evaluations on two widely used conversational search benchmarks, i.e., CAsT-19 and CAsT-20, demonstrate that the same model trained on the data generated by ConvTrans can achieve comparable retrieval performance as it trained on high-quality but expensive artificial conversational search data.

2020

pdf bib
A Joint Multiple Criteria Model in Transfer Learning for Cross-domain Chinese Word Segmentation
Kaiyu Huang | Degen Huang | Zhuang Liu | Fengran Mo
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Word-level information is important in natural language processing (NLP), especially for the Chinese language due to its high linguistic complexity. Chinese word segmentation (CWS) is an essential task for Chinese downstream NLP tasks. Existing methods have already achieved a competitive performance for CWS on large-scale annotated corpora. However, the accuracy of the method will drop dramatically when it handles an unsegmented text with lots of out-of-vocabulary (OOV) words. In addition, there are many different segmentation criteria for addressing different requirements of downstream NLP tasks. Excessive amounts of models with saving different criteria will generate the explosive growth of the total parameters. To this end, we propose a joint multiple criteria model that shares all parameters to integrate different segmentation criteria into one model. Besides, we utilize a transfer learning method to improve the performance of OOV words. Our proposed method is evaluated by designing comprehensive experiments on multiple benchmark datasets (e.g., Bakeoff 2005, Bakeoff 2008 and SIGHAN 2010). Our method achieves the state-of-the-art performances on all datasets. Importantly, our method also shows a competitive practicability and generalization ability for the CWS task.