Haitao Mi


2024

Inconsistent dialogue responses and how to recover from them
Mian Zhang | Lifeng Jin | Linfeng Song | Haitao Mi | Dong Yu
Findings of the Association for Computational Linguistics: EACL 2024

One critical issue for chat systems is staying consistent about their own preferences, opinions, beliefs, and facts, which has been shown to be a difficult problem. In this work, we study methods to assess and bolster utterance consistency of chat systems. A dataset is first developed for studying the inconsistencies, where inconsistent dialogue responses, explanations of the inconsistencies, and recovery utterances are authored by annotators. This covers the life span of inconsistencies, namely introduction, understanding, and resolution. Building on this, we introduce a set of tasks centered on dialogue consistency, specifically focused on its detection and resolution. Our experimental findings indicate that our dataset significantly helps progress in identifying and resolving conversational inconsistencies, and that current popular large language models like ChatGPT, while good at resolving inconsistencies, still struggle with detection.

Improving LLM Generations via Fine-Grained Self-Endorsement
Ante Wang | Linfeng Song | Baolin Peng | Lifeng Jin | Ye Tian | Haitao Mi | Jinsong Su | Dong Yu
Findings of the Association for Computational Linguistics ACL 2024

This work studies mitigating fact-conflicting hallucinations for large language models (LLMs) at inference time. Particularly, we propose a self-endorsement framework that leverages fine-grained fact-level comparisons across multiple sampled responses. Compared with prior ensemble methods (e.g., self-consistency) that perform response-level selection, our approach can better alleviate hallucinations for knowledge-intensive tasks. Our approach can broadly benefit smaller and open-source LLMs as it mainly conducts simple content-based comparisons. Experiments on Biographies show that our method can effectively improve the factuality of generations with simple and intuitive prompts across different scales of LLMs. Besides, comprehensive analyses on TriviaQA and GSM8K demonstrate the potential of self-endorsement for broader application.

A Knowledge Plug-and-Play Test Bed for Open-domain Dialogue Generation
Xiangci Li | Linfeng Song | Lifeng Jin | Haitao Mi | Jessica Ouyang | Dong Yu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Knowledge-based, open-domain dialogue generation aims to build chit-chat systems that talk to humans using mined support knowledge. Many types and sources of knowledge have previously been shown to be useful as support knowledge. Even in the era of large language models, response generation grounded in knowledge retrieved from additional up-to-date sources remains a practically important approach. While prior work using single-source knowledge has shown a clear positive correlation between the performances of knowledge selection and response generation, there are no existing multi-source datasets for evaluating support knowledge retrieval. Further, prior work has assumed that the knowledge sources available at test time are the same as during training. This unrealistic assumption unnecessarily handicaps models, as new knowledge sources can become available after a model is trained. In this paper, we present a high-quality benchmark named multi-source Wizard of Wikipedia (Ms.WoW) for evaluating multi-source dialogue knowledge selection and response generation. Unlike existing datasets, it contains clean support knowledge, grounded at the utterance level and partitioned into multiple knowledge sources. We further propose a new challenge, dialogue knowledge plug-and-play, which aims to test an already trained dialogue model on using new support knowledge from previously unseen sources in a zero-shot fashion.

Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
Xiaoying Zhang | Baolin Peng | Ye Tian | Jingyan Zhou | Lifeng Jin | Linfeng Song | Haitao Mi | Helen Meng
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Despite showing impressive abilities, large language models (LLMs) often struggle with factual inaccuracies, i.e., “hallucinations”, even when they hold relevant knowledge. To mitigate these hallucinations, current approaches typically necessitate high-quality human factuality annotations. In this work, we explore Self-Alignment for Factuality, where we leverage the self-evaluation capability of an LLM to provide training signals that steer the model towards factuality. Specifically, we incorporate Self-Eval, a self-evaluation component, to prompt an LLM to validate the factuality of its own generated responses solely based on its internal knowledge. Additionally, we design Self-Knowledge Tuning (SK-Tuning) to augment the LLM’s self-evaluation ability by improving the model’s confidence estimation and calibration. We then utilize these self-annotated responses to fine-tune the model via the Direct Preference Optimization algorithm. We show that the proposed self-alignment approach substantially enhances factual accuracy over Llama family models across three key knowledge-intensive tasks on TruthfulQA and BioGEN.

2023

OpenFact: Factuality Enhanced Open Knowledge Extraction
Linfeng Song | Ante Wang | Xiaoman Pan | Hongming Zhang | Dian Yu | Lifeng Jin | Haitao Mi | Jinsong Su | Yue Zhang | Dong Yu
Transactions of the Association for Computational Linguistics, Volume 11

We focus on the factuality property during the extraction of an OpenIE corpus named OpenFact, which contains more than 12 million high-quality knowledge triplets. We break down the factuality property into two important aspects (expressiveness and groundedness), and we propose a comprehensive framework to handle both aspects. To enhance expressiveness, we formulate each knowledge piece in OpenFact based on a semantic frame. We also design templates, extra constraints, and adopt human efforts so that most OpenFact triplets contain enough details. For groundedness, we require the main arguments of each triplet to contain linked Wikidata entities. A human evaluation suggests that the OpenFact triplets are much more accurate and contain denser information compared to OPIEC-Linked (Gashteovski et al., 2019), one recent high-quality OpenIE corpus grounded to Wikidata. Further experiments on knowledge base completion and knowledge base question answering show the effectiveness of OpenFact over OPIEC-Linked as supplementary knowledge to Wikidata as the major KG.

Discover, Explain, Improve: An Automatic Slice Detection Benchmark for Natural Language Processing
Wenyue Hua | Lifeng Jin | Linfeng Song | Haitao Mi | Yongfeng Zhang | Dong Yu
Transactions of the Association for Computational Linguistics, Volume 11

Pretrained natural language processing (NLP) models have achieved high overall performance, but they still make systematic errors. Instead of manual error analysis, research on slice detection models (SDMs), which automatically identify underperforming groups of datapoints, has attracted growing attention in Computer Vision for both understanding model behaviors and providing insights for future model training and design. However, little research on SDMs and little quantitative evaluation of their effectiveness have been conducted on NLP tasks. Our paper fills the gap by proposing a benchmark named “Discover, Explain, Improve (DEIm)” for classification NLP tasks along with a new SDM, Edisa. Edisa discovers coherent and underperforming groups of datapoints; DEIm then unites them under human-understandable concepts and provides comprehensive evaluation tasks and corresponding quantitative metrics. The evaluation in DEIm shows that Edisa can accurately select error-prone datapoints with informative semantic features that summarize error patterns. Detecting difficult datapoints directly boosts model performance without tuning any original model parameters, showing that discovered slices are actionable for users.

SafeConv: Explaining and Correcting Conversational Unsafe Behavior
Mian Zhang | Lifeng Jin | Linfeng Song | Haitao Mi | Wenliang Chen | Dong Yu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

One of the main challenges open-domain end-to-end dialogue systems, or chatbots, face is the prevalence of unsafe behavior, such as toxic language and harmful suggestions. However, existing dialogue datasets do not provide enough annotation to explain and correct such unsafe behavior. In this work, we construct a new dataset called SafeConv for the research of conversational safety: (1) Besides the utterance-level safety labels, SafeConv also provides unsafe spans in an utterance, information able to indicate which words contribute to the detected unsafe behavior; (2) SafeConv provides safe alternative responses to continue the conversation when unsafe behavior is detected, guiding the conversation to a gentle trajectory. By virtue of the comprehensive annotation of SafeConv, we benchmark three powerful models for the mitigation of conversational unsafe behavior, including a checker to detect unsafe utterances, a tagger to extract unsafe spans, and a rewriter to convert an unsafe response to a safe version. Moreover, we explore the huge benefits brought by combining the models for explaining the emergence of unsafe behavior and detoxifying chatbots. Experiments show that the detected unsafe behavior could be well explained with unsafe spans and popular chatbots could be detoxified to a large extent. The dataset is available at https://github.com/mianzhang/SafeConv.

Bi-level Finetuning with Task-dependent Similarity Structure for Low-resource Training
Sai Ashish Somayajula | Lifeng Jin | Linfeng Song | Haitao Mi | Dong Yu
Findings of the Association for Computational Linguistics: ACL 2023

Training large language models in low-resource settings is challenging since they are susceptible to overfitting and generalize poorly. Previous work addresses this issue with approaches such as reducing the number of tunable parameters or data augmentation. However, these approaches either limit the trained models’ expressiveness or rely on task-independent knowledge. In this paper, we propose the Bi-level Finetuning with Task-dependent Similarity Structure framework where all parameters, including the embeddings for unseen tokens, are finetuned with task-dependent information from the training data only. In this framework, a task-dependent similarity structure is learned in a data-driven fashion, which in turn is used to compose soft embeddings from conventional embeddings to be used in training to update all parameters. In order to learn the similarity structure and model parameters, we propose a bi-level optimization algorithm with two stages (search and finetune) to ensure successful learning. Results of experiments on several classification datasets in low-resource scenarios demonstrate that models trained with our method outperform strong baselines. Ablation experiments further support the effectiveness of different components in our framework. Code is available at https://github.com/Sai-Ashish/BFTSS.

Friend-training: Learning from Models of Different but Related Tasks
Mian Zhang | Lifeng Jin | Linfeng Song | Haitao Mi | Xiabing Zhou | Dong Yu
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Current self-training methods such as standard self-training, co-training, tri-training, and others often focus on improving model performance on a single task, utilizing differences in input features, model architectures, and training processes. However, many tasks in natural language processing are about different but related aspects of language, and models trained for one task can be great teachers for other related tasks. In this work, we propose friend-training, a cross-task self-training framework, where models trained to do different tasks are used in an iterative training, pseudo-labeling, and retraining process to help each other for better selection of pseudo-labels. With two dialogue understanding tasks, conversational semantic role labeling and dialogue rewriting, chosen for a case study, we show that the models trained with the friend-training framework achieve the best performance compared to strong baselines.

More Than Spoken Words: Nonverbal Message Extraction and Generation
Dian Yu | Xiaoyang Wang | Wanshun Chen | Nan Du | Longyue Wang | Haitao Mi | Dong Yu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Nonverbal messages (NM) such as speakers’ facial expressions and speed of speech are essential for face-to-face communication, and they can be regarded as implicit knowledge as they are usually not included in existing dialogue understanding or generation tasks. This paper introduces the task of extracting NMs in written text and generating NMs for spoken text. Previous studies merely focus on extracting NMs from relatively small-scale well-structured corpora such as movie scripts wherein NMs are enclosed in parentheses by scriptwriters, which greatly decreases the difficulty of extraction. To enable extracting NMs from unstructured corpora, we annotate the first NM extraction dataset for Chinese based on novels and develop three baselines to extract single-span or multi-span NM of a target utterance from its surrounding context. Furthermore, we use the extractors to extract 749K (context, utterance, NM) triples from Chinese novels and investigate whether we can use them to improve NM generation via semi-supervised learning. Experimental results demonstrate that the automatically extracted triples can serve as high-quality augmentation data of clean triples extracted from scripts to generate more relevant, fluent, valid, and factually consistent NMs than the purely supervised generator, and the resulting generator can in turn help Chinese dialogue understanding tasks such as dialogue machine reading comprehension and emotion classification by simply adding the predicted “unspoken” NM to each utterance or narrative in inputs.

2022

Learning a Grammar Inducer from Massive Uncurated Instructional Videos
Songyang Zhang | Linfeng Song | Lifeng Jin | Haitao Mi | Kun Xu | Dong Yu | Jiebo Luo
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Video-aided grammar induction aims to leverage video information for finding more accurate syntactic grammars for accompanying text. While previous work focuses on building systems for inducing grammars on text that is well-aligned with video content, we investigate the scenario in which text and video are only in loose correspondence. Such data can be found in abundance online, and the weak correspondence is similar to the indeterminacy problem studied in language acquisition. Furthermore, we build a new model that can better learn video-span correlation without the manually designed features adopted by previous work. Experiments show that our model, trained only on large-scale YouTube data with no text-video alignment, reports strong and robust performance across three unseen datasets, despite domain shift and noisy label issues. Furthermore, our model yields higher F1 scores than the previous state-of-the-art systems trained on in-domain data.

Fast-R2D2: A Pretrained Recursive Neural Network based on Pruned CKY for Grammar Induction and Text Representation
Xiang Hu | Haitao Mi | Liang Li | Gerard de Melo
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Chart-based models have shown great potential in unsupervised grammar induction, running recursively and hierarchically, but requiring O(n³) time-complexity. The Recursive Transformer based on Differentiable Trees (R2D2) makes it possible to scale to large language model pretraining even with a complex tree encoder, by introducing a heuristic pruning method. However, its rule-based pruning process suffers from local optima and slow inference. In this paper, we propose a unified R2D2 method that overcomes these issues. We use a top-down unsupervised parser as a model-guided pruning method, which also enables parallel encoding during inference. Our parser casts parsing as a split point scoring task by first scoring all split points for a given sentence and then using the highest-scoring one to recursively split a span into two parts. The reverse order of the splits is considered as the order of pruning in the encoder. We optimize the unsupervised parser by minimizing the Kullback–Leibler distance between tree probabilities from the parser and the R2D2 model. Our experiments show that our Fast-R2D2 significantly improves the grammar induction quality and achieves competitive results in downstream tasks.
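
The split-point formulation in the abstract above can be illustrated with a minimal Python sketch. This is not the Fast-R2D2 implementation: the function name `top_down_parse` and the hand-written scores are hypothetical, and the scores are assumed to come from some external scorer, with one score per boundary between adjacent tokens. The sketch only shows the recursion: every split point is scored once, and each span is then divided at its highest-scoring internal split point.

```python
from typing import List, Tuple, Union

# A binary tree is either a token (leaf) or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def top_down_parse(tokens: List[str], split_scores: List[float]) -> Tree:
    """Build a binary tree by recursive splitting (hypothetical helper).

    split_scores[i] is the score of the boundary between tokens[i] and
    tokens[i + 1]; in a real system these would come from a learned scorer.
    """
    if not tokens:
        raise ValueError("tokens must be non-empty")
    if len(split_scores) != len(tokens) - 1:
        raise ValueError("need exactly one score per boundary between tokens")

    def build(lo: int, hi: int) -> Tree:
        # The span covers tokens[lo..hi], both ends inclusive.
        if lo == hi:
            return tokens[lo]
        # Candidate split points inside the span are boundaries lo..hi-1.
        best = max(range(lo, hi), key=lambda i: split_scores[i])
        return (build(lo, best), build(best + 1, hi))

    return build(0, len(tokens) - 1)


tree = top_down_parse(["the", "cat", "sat", "down"], [0.1, 0.9, 0.3])
print(tree)
```

Because all scores are computed once up front, each recursive step only needs an argmax over the boundaries inside the current span. Ties are broken in favor of the leftmost boundary here, which is an arbitrary choice of this sketch.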

Cross-lingual Text-to-SQL Semantic Parsing with Representation Mixup
Peng Shi | Linfeng Song | Lifeng Jin | Haitao Mi | He Bai | Jimmy Lin | Dong Yu
Findings of the Association for Computational Linguistics: EMNLP 2022

We focus on the cross-lingual Text-to-SQL semantic parsing task, where the parsers are expected to generate SQL for non-English utterances based on English database schemas. Intuitively, English translation as side information is an effective way to bridge the language gap, but noise introduced by the translation system may affect parser effectiveness. In this work, we propose a Representation Mixup Framework (Rex) for effectively exploiting translations in the cross-lingual Text-to-SQL task. Particularly, it uses a general encoding layer, a transition layer, and a target-centric layer to properly guide the information flow of the English translation. Experimental results on CSpider and VSpider show that our framework can benefit from cross-lingual training and improve the effectiveness of semantic parsers, achieving state-of-the-art performance.

2021

R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling
Xiang Hu | Haitao Mi | Zujie Wen | Yafang Wang | Yi Su | Jing Zheng | Gerard de Melo
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Human language understanding operates at multiple levels of granularity (e.g., words, phrases, and sentences) with increasing levels of abstraction that can be hierarchically combined. However, existing deep models with stacked layers do not explicitly model any sort of hierarchical process. In this paper, we propose a recursive Transformer model based on differentiable CKY style binary trees to emulate this composition process, and we extend the bidirectional language model pre-training objective to this architecture, attempting to predict each word given its left and right abstraction nodes. To scale up our approach, we also introduce an efficient pruning and growing algorithm to reduce the time complexity and enable encoding in linear time. Experimental results on language modeling and unsupervised parsing show the effectiveness of our approach.

A Dialogue-based Information Extraction System for Medical Insurance Assessment
Shuang Peng | Mengdi Zhou | Minghui Yang | Haitao Mi | Shaosheng Cao | Zujie Wen | Teng Xu | Hongbin Wang | Lei Liu
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2016

Coverage Embedding Models for Neural Machine Translation
Haitao Mi | Baskaran Sankaran | Zhiguo Wang | Abe Ittycheriah
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Supervised Attentions for Neural Machine Translation
Haitao Mi | Zhiguo Wang | Abe Ittycheriah
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Sentence Similarity Learning by Lexical Decomposition and Composition
Zhiguo Wang | Haitao Mi | Abraham Ittycheriah
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Most conventional sentence similarity methods focus only on the similar parts of two input sentences and simply ignore the dissimilar parts, which usually give us some clues and semantic meanings about the sentences. In this work, we propose a model to take into account both the similarities and dissimilarities by decomposing and composing lexical semantics over sentences. The model represents each word as a vector, and calculates a semantic matching vector for each word based on all words in the other sentence. Then, each word vector is decomposed into a similar component and a dissimilar component based on the semantic matching vector. After this, a two-channel CNN model is employed to capture features by composing the similar and dissimilar components. Finally, a similarity score is estimated over the composed feature vectors. Experimental results show that our model achieves state-of-the-art performance on the answer sentence selection task, and achieves a comparable result on the paraphrase identification task.
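
The matching and decomposition steps in the abstract above can be sketched in a few lines of Python. The abstract does not specify how the matching vector or the decomposition is computed, so two choices here are assumptions made only for illustration: the matching vector is taken to be the most cosine-similar word vector in the other sentence, and the word vector is split into the part parallel to the matching vector (similar) and the part orthogonal to it (dissimilar). All function names are hypothetical, and the two-channel CNN composition step is not covered.

```python
import math
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Vector, b: Vector) -> float:
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def matching_vector(word: Vector, other_sentence: List[Vector]) -> Vector:
    # Assumption: use the single most similar word of the other sentence.
    return max(other_sentence, key=lambda v: cosine(word, v))


def decompose(word: Vector, match: Vector) -> Tuple[Vector, Vector]:
    """Split `word` into components parallel and orthogonal to `match`."""
    denom = dot(match, match)
    if denom == 0.0:
        # A zero matching vector carries no direction: nothing is "similar".
        return [0.0] * len(word), list(word)
    scale = dot(word, match) / denom
    similar = [scale * m for m in match]
    dissimilar = [w - s for w, s in zip(word, similar)]
    return similar, dissimilar


word = [1.0, 1.0]
other = [[1.0, 0.0], [0.0, -1.0]]
similar, dissimilar = decompose(word, matching_vector(word, other))
print(similar, dissimilar)
```

With this orthogonal split, the two components always sum back to the original word vector, so no information is discarded before a downstream model composes the two channels.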

Sense Embedding Learning for Word Sense Induction
Linfeng Song | Zhiguo Wang | Haitao Mi | Daniel Gildea
Proceedings of the Fifth Joint Conference on Lexical and Computational Semantics

Semi-supervised Clustering for Short Text via Deep Representation Learning
Zhiguo Wang | Haitao Mi | Abraham Ittycheriah
Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning

Vocabulary Manipulation for Neural Machine Translation
Haitao Mi | Zhiguo Wang | Abe Ittycheriah
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2015

Shift-Reduce Constituency Parsing with Dynamic Programming and POS Tag Lattice
Haitao Mi | Liang Huang
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Feature Optimization for Constituent Parsing via Neural Networks
Zhiguo Wang | Haitao Mi | Nianwen Xue
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

A Structured Language Model for Incremental Tree-to-String Translation
Heng Yu | Haitao Mi | Liang Huang | Qun Liu
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Hierarchical MT Training using Max-Violation Perceptron
Kai Zhao | Liang Huang | Haitao Mi | Abe Ittycheriah
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2013

Flexible and Efficient Hypergraph Interactions for Joint Hierarchical and Forest-to-String Decoding
Martin Čmejrek | Haitao Mi | Bowen Zhou
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Max-Violation Perceptron and Forced Decoding for Scalable MT Training
Heng Yu | Liang Huang | Haitao Mi | Kai Zhao
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2011

Rule Markov Models for Fast Tree-to-String Translation
Ashish Vaswani | Haitao Mi | Liang Huang | David Chiang
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Bagging-based System Combination for Domain Adaption
Linfeng Song | Haitao Mi | Yajuan Lü | Qun Liu
Proceedings of Machine Translation Summit XIII: Papers

A novel dependency-to-string model for statistical machine translation
Jun Xie | Haitao Mi | Qun Liu
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

2010

Efficient Incremental Decoding for Tree-to-String Translation
Liang Huang | Haitao Mi
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

The ICT statistical machine translation system for IWSLT 2010
Hao Xiong | Jun Xie | Hui Yu | Kai Liu | Wei Luo | Haitao Mi | Yang Liu | Yajuan Lü | Qun Liu
Proceedings of the 7th International Workshop on Spoken Language Translation: Evaluation Campaign

Constituency to Dependency Translation with Forests
Haitao Mi | Qun Liu
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Learning Lexicalized Reordering Models from Reordering Graphs
Jinsong Su | Yang Liu | Yajuan Lv | Haitao Mi | Qun Liu
Proceedings of the ACL 2010 Conference Short Papers

Statistical Translation Model Based On Source Syntax Structure
Qun Liu | Yang Liu | Haitao Mi
Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

An Efficient Shift-Reduce Decoding Algorithm for Phrased-Based Machine Translation
Yang Feng | Haitao Mi | Yang Liu | Qun Liu
Coling 2010: Posters

Machine Translation with Lattices and Forests
Haitao Mi | Liang Huang | Qun Liu
Coling 2010: Posters

Dependency-Based Bracketing Transduction Grammar for Statistical Machine Translation
Jinsong Su | Yang Liu | Haitao Mi | Hongmei Zhao | Yajuan Lv | Qun Liu
Coling 2010: Posters

2009

The ICT statistical machine translation system for the IWSLT 2009
Haitao Mi | Yang Li | Tian Xia | Xinyan Xiao | Yang Feng | Jun Xie | Hao Xiong | Zhaopeng Tu | Daqi Zheng | Yanjuan Lu | Qun Liu
Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the ICT statistical machine translation systems that were used in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2009. For this year’s evaluation, we participated in the Challenge Task (Chinese-English and English-Chinese) and the BTEC Task (Chinese-English). We mainly focus on one new method to improve a single system’s translation quality. Specifically, we developed a sentence-similarity based development set selection technique. For each task, we finally submitted the single system that achieved the maximum BLEU score on the selected development set. The four single translation systems are based on different techniques: a linguistically syntax-based system, two formally syntax-based systems and a phrase-based system. Notably, we did not use any rescoring or system combination techniques in this year’s evaluation.

Lattice-based System Combination for Statistical Machine Translation
Yang Feng | Yang Liu | Haitao Mi | Qun Liu | Yajuan Lü
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Joint Decoding with Multiple Translation Models
Yang Liu | Haitao Mi | Yang Feng | Qun Liu
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

Sub-Sentence Division for Tree-Based Machine Translation
Hao Xiong | Wenwen Xu | Haitao Mi | Yang Liu | Qun Liu
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

2008

Forest-based Translation Rule Extraction
Haitao Mi | Liang Huang
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

Forest-Based Translation
Haitao Mi | Liang Huang | Qun Liu
Proceedings of ACL-08: HLT

Refinements in BTG-based Statistical Machine Translation
Deyi Xiong | Min Zhang | AiTi Aw | Haitao Mi | Qun Liu | Shouxun Lin
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

Word Lattice Reranking for Chinese Word Segmentation and Part-of-Speech Tagging
Wenbin Jiang | Haitao Mi | Qun Liu
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

The ICT system description for IWSLT 2008.
Yang Liu | Zhongjun He | Haitao Mi | Yun Huang | Yang Feng | Wenbin Jiang | Yajuan Lu | Qun Liu
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper presents a description for the ICT systems involved in the IWSLT 2008 evaluation campaign. This year, we participated in Chinese-English and English-Chinese translation directions. Four statistical machine translation systems were used: one linguistically syntax-based, two formally syntax-based, and one phrase-based. The outputs of the four SMT systems were fed to a sentence-level system combiner, which was expected to produce better translations than single systems. We will report the results of the four single systems and the combiner on both the development and test sets.

2007

The ICT statistical machine translation systems for IWSLT 2007
Zhongjun He | Haitao Mi | Yang Liu | Deyi Xiong | Weihua Luo | Yun Huang | Zhixiang Ren | Yajuan Lu | Qun Liu
Proceedings of the Fourth International Workshop on Spoken Language Translation

In this paper, we give an overview of the ICT statistical machine translation systems for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2007. In this year’s evaluation, we participated in the Chinese-English transcript translation task, and developed three systems based on different techniques: a formally syntax-based system Bruin, an extended phrase-based system Confucius and a linguistically syntax-based system Lynx. We will describe the models of these three systems, and compare their performance in detail. We set Bruin as our primary system, which ranks second among the 15 primary results according to the official evaluation results.