Paper Title
Denoising Relation Extraction from Document-level Distant Supervision
Authors
Abstract
Distant supervision (DS) has been widely used to generate auto-labeled data for sentence-level relation extraction (RE), which improves RE performance. However, this success does not transfer directly to the more challenging task of document-level relation extraction (DocRE), because the noise inherent in DS can multiply at the document level and significantly harm RE performance. To address this challenge, we propose a novel pre-trained model for DocRE that denoises document-level DS data via multiple pre-training tasks. Experimental results on the large-scale DocRE benchmark show that our model can capture useful information from noisy DS data and achieve promising results.
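
To make the noise problem concrete, the following is a minimal sketch of the standard distant-supervision labeling heuristic, not the method proposed in this paper. It assumes entities have already been recognized in each text; the toy knowledge base `KB`, the entity names, and the helper functions `candidate_pairs` and `distant_labels` are all hypothetical and chosen only for illustration.

```python
from itertools import permutations

# Hypothetical toy knowledge base: (head, tail) -> relation.
KB = {
    ("Marie Curie", "Warsaw"): "place_of_birth",
    ("Marie Curie", "Pierre Curie"): "spouse",
}


def candidate_pairs(entities):
    """Return every ordered (head, tail) pair of distinct entities."""
    return list(permutations(entities, 2))


def distant_labels(entities, kb):
    """Assign a KB relation to every co-occurring entity pair found in the KB.

    The text itself is never inspected, so a label is assigned whether or
    not the text actually expresses the relation. This is the source of noise.
    """
    return {pair: kb[pair] for pair in candidate_pairs(entities) if pair in kb}


# Entities assumed to be already recognized (illustrative values only).
sentence_entities = ["Marie Curie", "Pierre Curie"]
document_entities = ["Marie Curie", "Pierre Curie", "Warsaw", "Paris"]

sentence_labels = distant_labels(sentence_entities, KB)
document_labels = distant_labels(document_entities, KB)

print(len(candidate_pairs(sentence_entities)), sentence_labels)
print(len(candidate_pairs(document_entities)), document_labels)
```

With these toy inputs the sentence yields 2 ordered candidate pairs and the document yields 12. The number of candidate pairs grows quadratically with the number of entities in a text, and each pair that matches the knowledge base receives a label regardless of what the text says, which is one way to see why DS noise becomes more severe at the document level.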