Paper title
Revise and Resubmit: An Intertextual Model of Text-based Collaboration in Peer Review
Paper authors
Paper abstract
Peer review is a key component of the publishing process in most fields of science. Rising submission rates put a strain on reviewing quality and efficiency, motivating the development of applications to support reviewing and editorial work. While existing NLP studies focus on the analysis of individual texts, editorial assistance often requires modeling interactions between pairs of texts -- yet general frameworks and datasets to support this scenario are missing. Relationships between texts are the core object of intertextuality theory -- a family of approaches in literary studies not yet operationalized in NLP. Inspired by prior theoretical work, we propose the first intertextual model of text-based collaboration, which encompasses three major phenomena that make up a full iteration of the review-revise-and-resubmit cycle: pragmatic tagging, linking, and long-document version alignment. While peer review is used across the fields of science and publication formats, existing datasets focus solely on conference-style review in computer science. Addressing this, we instantiate our proposed model in the first annotated multi-domain corpus of journal-style post-publication open peer review, and provide detailed insights into the practical aspects of intertextual annotation. Our resource is a major step towards multi-domain, fine-grained applications of NLP in editorial support for peer review, and our intertextual framework paves the way for general-purpose modeling of text-based collaboration. Our corpus and accompanying code are publicly available.