Paper Title
Leveraging pre-trained language models for conversational information seeking from text
Paper Authors
Paper Abstract
Recent advances in Natural Language Processing, and in particular in the construction of very large pre-trained language representation models, are opening up new perspectives on the construction of conversational information seeking (CIS) systems. In this paper we investigate the usage of in-context learning and pre-trained language representation models to address the problem of information extraction from process description documents, in an incremental question-and-answering oriented fashion. In particular, we investigate the usage of the native GPT-3 (Generative Pre-trained Transformer 3) model, together with two in-context learning customizations that inject conceptual definitions and a limited number of samples in a few-shot learning fashion. The results highlight the potential of the approach and the usefulness of the in-context learning customizations, which can substantially contribute to addressing the "training data challenge" of deep learning-based NLP techniques in the BPM field. They also highlight the challenge posed by control-flow relations, for which further training needs to be devised.
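As an illustration of the in-context learning customizations described in the abstract, the sketch below builds a GPT-3 prompt that prepends conceptual definitions and a small number of annotated samples before a target extraction question about a process description. This is a minimal sketch, not the authors' actual implementation: the definitions, the few-shot examples, and the `extract_process_info` helper are hypothetical, and the call assumes the legacy OpenAI Completion API (`openai.Completion.create`) that the native GPT-3 model was served through.

```python
import openai

openai.api_key = "sk-..."  # set your own API key

# Hypothetical conceptual definitions, mirroring the paper's
# "definition injection" customization.
DEFINITIONS = (
    "An Activity is a unit of work performed within a process.\n"
    "An Actor is the person or system that performs an activity.\n"
)

# A hypothetical annotated sample, mirroring the few-shot customization.
FEW_SHOT_EXAMPLES = (
    "Text: The clerk checks the invoice and forwards it to the manager.\n"
    "Q: Who performs the activity 'checks the invoice'?\n"
    "A: The clerk\n\n"
)

def extract_process_info(process_text: str, question: str) -> str:
    """Ask GPT-3 a single extraction question about a process description."""
    prompt = (
        DEFINITIONS + "\n" + FEW_SHOT_EXAMPLES
        + f"Text: {process_text}\nQ: {question}\nA:"
    )
    response = openai.Completion.create(
        engine="davinci",   # a native GPT-3 engine (assumed choice)
        prompt=prompt,
        max_tokens=50,
        temperature=0.0,    # deterministic answers for extraction
        stop=["\n"],        # stop at the end of the answer line
    )
    return response["choices"][0]["text"].strip()

# Incremental question-and-answering usage: one question at a time.
answer = extract_process_info(
    "After the order is received, the warehouse prepares the shipment.",
    "Which activity follows 'the order is received'?",
)
print(answer)
```

Asking one question per call, as above, matches the incremental question-and-answering setup the abstract describes; the prompt prefix (definitions plus samples) stays fixed while only the text and question vary.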