Stance Prediction and Claim Verification: An Arabic Perspective

Paper Title

Stance Prediction and Claim Verification: An Arabic Perspective

Authors

Khouja, Jude

Abstract

This work explores the application of textual entailment in news claim verification and stance prediction using a new corpus in Arabic. The publicly available corpus comes in two perspectives: a version consisting of 4,547 true and false claims and a version consisting of 3,786 pairs (claim, evidence). We describe the methodology for creating the corpus and the annotation process. Using the introduced corpus, we also develop two machine learning baselines for two proposed tasks: claim verification and stance prediction. Our best model utilizes pretraining (BERT) and achieves 76.7 F1 on the stance prediction task and 64.3 F1 on the claim verification task. Our preliminary experiments shed some light on the limits of automatic claim verification that relies on claims text only. Results hint that while the linguistic features and world knowledge learned during pretraining are useful for stance prediction, such learned representations from pretraining are insufficient for verifying claims without access to context or evidence.
