Ruiji Fu


2024

pdf bib
Decompose, Prioritize, and Eliminate: Dynamically Integrating Diverse Representations for Multimodal Named Entity Recognition
Zihao Zheng | Zihan Zhang | Zexin Wang | Ruiji Fu | Ming Liu | Zhongyuan Wang | Bing Qin
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Multi-modal Named Entity Recognition, a fundamental task for multi-modal knowledge graph construction, requires integrating multi-modal information to extract named entities from text. Previous research has explored the integration of multi-modal representations at different granularities. However, they struggle to integrate all these multi-modal representations to provide rich contextual information to improve multi-modal named entity recognition. In this paper, we propose DPE-MNER, which is an iterative reasoning framework that dynamically incorporates all the diverse multi-modal representations following the strategy of “decompose, prioritize, and eliminate”. Within the framework, the fusion of diverse multi-modal representations is decomposed into hierarchically connected fusion layers that are easier to handle. The incorporation of multi-modal information prioritizes transitioning from “easy-to-hard” and “coarse-to-fine”. The explicit modeling of cross-modal relevance eliminate the irrelevances that will mislead the MNER prediction. Extensive experiments on two public datasets have demonstrated the effectiveness of our approach.

2021

pdf bib
Verb Metaphor Detection via Contextual Relation Learning
Wei Song | Shuhui Zhou | Ruiji Fu | Ting Liu | Lizhen Liu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Correct natural language understanding requires computers to distinguish the literal and metaphorical senses of a word. Recent neu- ral models achieve progress on verb metaphor detection by viewing it as sequence labeling. In this paper, we argue that it is appropriate to view this task as relation classification between a verb and its various contexts. We propose the Metaphor-relation BERT (Mr-BERT) model, which explicitly models the relation between a verb and its grammatical, sentential and semantic contexts. We evaluate our method on the VUA, MOH-X and TroFi datasets. Our method gets competitive results compared with state-of-the-art approaches.

pdf bib
IFlyEA: A Chinese Essay Assessment System with Automated Rating, Review Generation, and Recommendation
Jiefu Gong | Xiao Hu | Wei Song | Ruiji Fu | Zhichao Sheng | Bo Zhu | Shijin Wang | Ting Liu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations

Automated Essay Assessment (AEA) aims to judge students’ writing proficiency in an automatic way. This paper presents a Chinese AEA system IFlyEssayAssess (IFlyEA), targeting on evaluating essays written by native Chinese students from primary and junior schools. IFlyEA provides multi-level and multi-dimension analytical modules for essay assessment. It has state-of-the-art grammar level analysis techniques, and also integrates components for rhetoric and discourse level analysis, which are important for evaluating native speakers’ writing ability, but still challenging and less studied in previous work. Based on the comprehensive analysis, IFlyEA provides application services for essay scoring, review generation, recommendation, and explainable analytical visualization. These services can benefit both teachers and students during the process of writing teaching and learning.

2020

pdf bib
Combining ResNet and Transformer for Chinese Grammatical Error Diagnosis
Shaolei Wang | Baoxin Wang | Jiefu Gong | Zhongyuan Wang | Xiao Hu | Xingyi Duan | Zizhuo Shen | Gang Yue | Ruiji Fu | Dayong Wu | Wanxiang Che | Shijin Wang | Guoping Hu | Ting Liu
Proceedings of the 6th Workshop on Natural Language Processing Techniques for Educational Applications

Grammatical error diagnosis is an important task in natural language processing. This paper introduces our system at NLPTEA-2020 Task: Chinese Grammatical Error Diagnosis (CGED). CGED aims to diagnose four types of grammatical errors which are missing words (M), redundant words (R), bad word selection (S) and disordered words (W). Our system is built on the model of multi-layer bidirectional transformer encoder and ResNet is integrated into the encoder to improve the performance. We also explore two ensemble strategies including weighted averaging and stepwise ensemble selection from libraries of models to improve the performance of single model. In official evaluation, our system obtains the highest F1 scores at identification level and position level. We also recommend error corrections for specific error types and achieve the second highest F1 score at correction level.

pdf bib
Discourse Self-Attention for Discourse Element Identification in Argumentative Student Essays
Wei Song | Ziyao Song | Ruiji Fu | Lizhen Liu | Miaomiao Cheng | Ting Liu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

This paper proposes to adapt self-attention to discourse level for modeling discourse elements in argumentative student essays. Specifically, we focus on two issues. First, we propose structural sentence positional encodings to explicitly represent sentence positions. Second, we propose to use inter-sentence attentions to capture sentence interactions and enhance sentence representation. We conduct experiments on two datasets: a Chinese dataset and an English dataset. We find that (i) sentence positional encoding can lead to a large improvement for identifying discourse elements; (ii) a structural relative positional encoding of sentences shows to be most effective; (iii) inter-sentence attention vectors are useful as a kind of sentence representations for identifying discourse elements.

pdf bib
Multi-Stage Pre-training for Automated Chinese Essay Scoring
Wei Song | Kai Zhang | Ruiji Fu | Lizhen Liu | Ting Liu | Miaomiao Cheng
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

This paper proposes a pre-training based automated Chinese essay scoring method. The method involves three components: weakly supervised pre-training, supervised cross- prompt fine-tuning and supervised target- prompt fine-tuning. An essay scorer is first pre- trained on a large essay dataset covering diverse topics and with coarse ratings, i.e., good and poor, which are used as a kind of weak supervision. The pre-trained essay scorer would be further fine-tuned on previously rated es- says from existing prompts, which have the same score range with the target prompt and provide extra supervision. At last, the scorer is fine-tuned on the target-prompt training data. The evaluation on four prompts shows that this method can improve a state-of-the-art neural essay scorer in terms of effectiveness and domain adaptation ability, while in-depth analysis also reveals its limitations..

2018

pdf bib
Chinese Grammatical Error Diagnosis using Statistical and Prior Knowledge driven Features with Probabilistic Ensemble Enhancement
Ruiji Fu | Zhengqi Pei | Jiefu Gong | Wei Song | Dechuan Teng | Wanxiang Che | Shijin Wang | Guoping Hu | Ting Liu
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications

This paper describes our system at NLPTEA-2018 Task #1: Chinese Grammatical Error Diagnosis. Grammatical Error Diagnosis is one of the most challenging NLP tasks, which is to locate grammar errors and tell error types. Our system is built on the model of bidirectional Long Short-Term Memory with a conditional random field layer (BiLSTM-CRF) but integrates with several new features. First, richer features are considered in the BiLSTM-CRF model; second, a probabilistic ensemble approach is adopted; third, Template Matcher are used during a post-processing to bring in human knowledge. In official evaluation, our system obtains the highest F1 scores at identifying error types and locating error positions, the second highest F1 score at sentence level error detection. We also recommend error corrections for specific error types and achieve the best F1 performance among all participants.

pdf bib
Neural Multitask Learning for Simile Recognition
Lizhen Liu | Xiao Hu | Wei Song | Ruiji Fu | Ting Liu | Guoping Hu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Simile is a special type of metaphor, where comparators such as like and as are used to compare two objects. Simile recognition is to recognize simile sentences and extract simile components, i.e., the tenor and the vehicle. This paper presents a study of simile recognition in Chinese. We construct an annotated corpus for this research, which consists of 11.3k sentences that contain a comparator. We propose a neural network framework for jointly optimizing three tasks: simile sentence classification, simile component extraction and language modeling. The experimental results show that the neural network based approaches can outperform all rule-based and feature-based baselines. Both simile sentence classification and simile component extraction can benefit from multitask learning. The former can be solved very well, while the latter is more difficult.

2017

pdf bib
Discourse Mode Identification in Essays
Wei Song | Dong Wang | Ruiji Fu | Lizhen Liu | Ting Liu | Guoping Hu
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Discourse modes play an important role in writing composition and evaluation. This paper presents a study on the manual and automatic identification of narration,exposition, description, argument and emotion expressing sentences in narrative essays. We annotate a corpus to study the characteristics of discourse modes and describe a neural sequence labeling model for identification. Evaluation results show that discourse modes can be identified automatically with an average F1-score of 0.7. We further demonstrate that discourse modes can be used as features that improve automatic essay scoring (AES). The impacts of discourse modes for AES are also discussed.

2016

pdf bib
Learning to Identify Sentence Parallelism in Student Essays
Wei Song | Tong Liu | Ruiji Fu | Lizhen Liu | Hanshi Wang | Ting Liu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Parallelism is an important rhetorical device. We propose a machine learning approach for automated sentence parallelism identification in student essays. We build an essay dataset with sentence level parallelism annotated. We derive features by combining generalized word alignment strategies and the alignment measures between word sequences. The experimental results show that sentence parallelism can be effectively identified with a F1 score of 82% at pair-wise level and 72% at parallelism chunk level. Based on this approach, we automatically identify sentence parallelism in more than 2000 student essays and study the correlation between the use of sentence parallelism and the types and quality of essays.

pdf bib
Anecdote Recognition and Recommendation
Wei Song | Ruiji Fu | Lizhen Liu | Hanshi Wang | Ting Liu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We introduce a novel task Anecdote Recognition and Recommendation. An anecdote is a story with a point revealing account of an individual person. Recommending proper anecdotes can be used as evidence to support argumentative writing or as a clue for further reading. We represent an anecdote as a structured tuple — < person, story, implication >. Anecdote recognition runs on archived argumentative essays. We extract narratives containing events of a person as the anecdote story. More importantly, we uncover the anecdote implication, which reveals the meaning and topic of an anecdote. Our approach depends on discourse role identification. Discourse roles such as thesis, main ideas and support help us locate stories and their implications in essays. The experiments show that informative and interpretable anecdotes can be recognized. These anecdotes are used for anecdote recommendation. The anecdote recommender can recommend proper anecdotes in response to given topics. The anecdote implication contributes most for bridging user interested topics and relevant anecdotes.

2015

pdf bib
Discourse Element Identification in Student Essays based on Global and Local Cohesion
Wei Song | Ruiji Fu | Lizhen Liu | Ting Liu
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

pdf bib
Learning Semantic Hierarchies via Word Embeddings
Ruiji Fu | Jiang Guo | Bing Qin | Wanxiang Che | Haifeng Wang | Ting Liu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2013

pdf bib
Exploiting Multiple Sources for Open-Domain Hypernym Discovery
Ruiji Fu | Bing Qin | Ting Liu
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2011

pdf bib
Generating Chinese Named Entity Data from a Parallel Corpus
Ruiji Fu | Bing Qin | Ting Liu
Proceedings of 5th International Joint Conference on Natural Language Processing