Paper Title

A sequence-to-sequence approach for document-level relation extraction

Authors

John Giorgi, Gary D. Bader, Bo Wang

Abstract

Motivated by the fact that many relations cross the sentence boundary, there has been increasing interest in document-level relation extraction (DocRE). DocRE requires integrating information within and across sentences, capturing complex interactions between mentions of entities. Most existing methods are pipeline-based, requiring entities as input. However, jointly learning to extract entities and relations can improve performance and be more efficient due to shared parameters and training steps. In this paper, we develop a sequence-to-sequence approach, seq2rel, that can learn the subtasks of DocRE (entity extraction, coreference resolution and relation extraction) end-to-end, replacing a pipeline of task-specific components. Using a simple strategy we call entity hinting, we compare our approach to existing pipeline-based methods on several popular biomedical datasets, in some cases exceeding their performance. We also report the first end-to-end results on these datasets for future comparison. Finally, we demonstrate that, under our model, an end-to-end approach outperforms a pipeline-based approach. Our code, data and trained models are available at https://github.com/johngiorgi/seq2rel. An online demo is available at https://share.streamlit.io/johngiorgi/seq2rel/main/demo.py.
