Paper Title

TAPAS: Weakly Supervised Table Parsing via Pre-training

Paper Authors

Jonathan Herzig, Paweł Krzysztof Nowak, Thomas Müller, Francesco Piccinno, Julian Martin Eisenschlos

Paper Abstract

Answering natural language questions over tables is usually seen as a semantic parsing task. To alleviate the collection cost of full logical forms, one popular approach focuses on weak supervision consisting of denotations instead of logical forms. However, training semantic parsers from weak supervision poses difficulties, and in addition, the generated logical forms are only used as an intermediate step prior to retrieving the denotation. In this paper, we present TAPAS, an approach to question answering over tables without generating logical forms. TAPAS trains from weak supervision, and predicts the denotation by selecting table cells and optionally applying a corresponding aggregation operator to such selection. TAPAS extends BERT's architecture to encode tables as input, initializes from an effective joint pre-training of text segments and tables crawled from Wikipedia, and is trained end-to-end. We experiment with three different semantic parsing datasets, and find that TAPAS outperforms or rivals semantic parsing models by improving state-of-the-art accuracy on SQA from 55.1 to 67.2 and performing on par with the state-of-the-art on WIKISQL and WIKITQ, but with a simpler model architecture. We additionally find that transfer learning, which is trivial in our setting, from WIKISQL to WIKITQ, yields 48.7 accuracy, 4.2 points above the state-of-the-art.
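For readers who want to try the approach described in the abstract, the sketch below runs a single table question through the Hugging Face transformers port of TAPAS. The library, the checkpoint name "google/tapas-base-finetuned-wtq", and the example table are assumptions drawn from that library's public API rather than material from this page; the output mirrors the abstract's description of a prediction made by selecting table cells and an aggregation operator.

# Minimal sketch, assuming the Hugging Face transformers port of TAPAS and the
# public "google/tapas-base-finetuned-wtq" checkpoint (not part of the paper page).
import pandas as pd
from transformers import TapasTokenizer, TapasForQuestionAnswering

tokenizer = TapasTokenizer.from_pretrained("google/tapas-base-finetuned-wtq")
model = TapasForQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")

# TAPAS encodes the table jointly with the question; cell values must be strings.
table = pd.DataFrame(
    {"City": ["Paris", "London", "Madrid"],
     "Population": ["2148000", "8982000", "3223000"]}
).astype(str)
queries = ["How many cities are listed?"]

inputs = tokenizer(table=table, queries=queries, padding="max_length", return_tensors="pt")
outputs = model(**inputs)

# The model predicts (1) a selection over table cells and (2) an aggregation
# operator (NONE / SUM / AVERAGE / COUNT) applied to that selection.
coords, agg_indices = tokenizer.convert_logits_to_predictions(
    inputs, outputs.logits.detach(), outputs.logits_aggregation.detach()
)
print(coords[0], agg_indices[0])  # selected (row, column) pairs and operator id

Because the denotation is produced directly from the cell selection and the aggregation index, no intermediate logical form is generated at any point, which is the simplification the abstract emphasizes.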
