Syntactic and Semantic-driven Learning for Open Information Extraction

Jialong Tang, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun, Xinyan Xiao, Hua Wu


Abstract
One of the biggest bottlenecks in building accurate, high-coverage neural open IE systems is the need for large labelled corpora. The diversity of open-domain corpora and the variety of natural language expressions further exacerbate this problem. In this paper, we propose a syntactic and semantic-driven learning approach, which can learn neural open IE models without any human-labelled data by leveraging syntactic and semantic knowledge as noisier, higher-level supervision. Specifically, we first employ syntactic patterns as data labelling functions and pretrain a base model using the generated labels. Then we propose a syntactic and semantic-driven reinforcement learning algorithm, which can effectively generalize the base model to open situations with high accuracy. Experimental results show that our approach significantly outperforms the supervised counterparts, and can even achieve performance competitive with the supervised state-of-the-art (SoA) model.
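To make the idea of a syntactic pattern acting as a data labelling function more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the input format (token and coarse POS tag pairs), the tag sets, and the function name `label_svo` are all assumptions chosen for illustration, and the single noun-phrase, verb, noun-phrase pattern is far simpler than what a real open IE system would use.

```python
from typing import List, Optional, Tuple

# Hypothetical input format: a sentence as (token, coarse POS tag) pairs.
Tagged = List[Tuple[str, str]]
Triple = Tuple[str, str, str]

# Assumed tag sets for this toy pattern (not taken from the paper).
NOUN_PHRASE_TAGS = {"DET", "ADJ", "NOUN", "PROPN"}
VERB_TAGS = {"AUX", "VERB"}


def take_while(sentence: Tagged, start: int, tags: set) -> Tuple[List[str], int]:
    """Collect consecutive tokens whose tag is in `tags`, starting at `start`."""
    words = []
    i = start
    while i < len(sentence) and sentence[i][1] in tags:
        words.append(sentence[i][0])
        i += 1
    return words, i


def label_svo(sentence: Tagged) -> Optional[Triple]:
    """Toy labelling function for the pattern: noun phrase, verb(s), noun phrase.

    Returns a (subject, predicate, object) triple when the whole sentence
    matches the pattern, and None (abstain) otherwise.
    """
    subj, i = take_while(sentence, 0, NOUN_PHRASE_TAGS)
    pred, i = take_while(sentence, i, VERB_TAGS)
    obj, i = take_while(sentence, i, NOUN_PHRASE_TAGS)
    _, i = take_while(sentence, i, {"PUNCT"})  # allow trailing punctuation

    if subj and pred and obj and i == len(sentence):
        return (" ".join(subj), " ".join(pred), " ".join(obj))
    return None


if __name__ == "__main__":
    example = [
        ("Marie", "PROPN"), ("Curie", "PROPN"),
        ("discovered", "VERB"), ("polonium", "NOUN"), (".", "PUNCT"),
    ]
    print(label_svo(example))
```

A labelling function of this kind abstains (returns None) on sentences it does not cover, so the labels it generates are noisy and incomplete. That is the sense in which the abstract describes them as a starting point for pretraining a base model rather than as gold supervision.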
Anthology ID:
2020.findings-emnlp.69
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Editors:
Trevor Cohn, Yulan He, Yang Liu
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
782–792
URL:
https://aclanthology.org/2020.findings-emnlp.69
DOI:
10.18653/v1/2020.findings-emnlp.69
Cite (ACL):
Jialong Tang, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun, Xinyan Xiao, and Hua Wu. 2020. Syntactic and Semantic-driven Learning for Open Information Extraction. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 782–792, Online. Association for Computational Linguistics.
Cite (Informal):
Syntactic and Semantic-driven Learning for Open Information Extraction (Tang et al., Findings 2020)
PDF:
https://aclanthology.org/2020.findings-emnlp.69.pdf
Optional supplementary material:
 2020.findings-emnlp.69.OptionalSupplementaryMaterial.zip
Code
 TangJiaLong/SSD-OpenIE
Data
OIE2016