Paper Title

Denoising Relation Extraction from Document-level Distant Supervision

Authors

Chaojun Xiao, Yuan Yao, Ruobing Xie, Xu Han, Zhiyuan Liu, Maosong Sun, Fen Lin, Leyu Lin

Abstract

Distant supervision (DS) has been widely used to generate auto-labeled data for sentence-level relation extraction (RE), which improves RE performance. However, the success of DS at the sentence level cannot be directly transferred to the more challenging task of document-level relation extraction (DocRE), because the noise inherent in DS may be multiplied at the document level and significantly harm RE performance. To address this challenge, we propose a novel pre-trained model for DocRE that denoises document-level DS data via multiple pre-training tasks. Experimental results on the large-scale DocRE benchmark show that our model can capture useful information from noisy DS data and achieve promising results.
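The noise problem described in the abstract can be illustrated with a toy sketch. This is not the paper's model or data: the knowledge base, the entity names, and the `distant_labels` helper are all hypothetical, and the labeling rule (tag every co-occurring entity pair found in the knowledge base) is the generic distant-supervision heuristic. Under those assumptions, widening the co-occurrence window from one sentence to a whole document produces extra labels that the text never states.

```python
from itertools import permutations

# Hypothetical knowledge base: (head, tail) -> relation.
KB = {
    ("Alice", "Acme Corp"): "employee_of",
    ("Acme Corp", "Springfield"): "headquartered_in",
    ("Bob", "Springfield"): "place_of_birth",
}

# Hypothetical document, reduced to the entities mentioned in each sentence.
DOC_SENTENCES = [
    ["Alice", "Acme Corp"],    # "Alice joined Acme Corp in 2010."
    ["Alice", "Springfield"],  # "Alice later moved to Springfield."
    ["Acme Corp", "Bob"],      # "Acme Corp sued Bob last year."
]


def distant_labels(entities, kb):
    """Label every ordered pair of co-occurring entities that the KB contains."""
    unique = sorted(set(entities))
    return {
        (head, tail): kb[(head, tail)]
        for head, tail in permutations(unique, 2)
        if (head, tail) in kb
    }


# Sentence-level DS: a pair must co-occur inside a single sentence.
sentence_labels = {}
for sentence in DOC_SENTENCES:
    sentence_labels.update(distant_labels(sentence, KB))

# Document-level DS: a pair only has to co-occur somewhere in the document.
all_entities = [entity for sentence in DOC_SENTENCES for entity in sentence]
document_labels = distant_labels(all_entities, KB)

# Labels that appear only because of the wider, document-level window.
extra_labels = set(document_labels) - set(sentence_labels)

print("sentence-level labels:", len(sentence_labels))
print("document-level labels:", len(document_labels))
print("added by document-level window:", sorted(extra_labels))
```

In this toy document, the two extra pairs are labeled even though no sentence expresses them. Real documents can also express relations across sentences, so not every extra label is wrong; the sketch only shows why the share of unsupported labels tends to grow with the window.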
