Paper Title
MLMLM: Link Prediction with Mean Likelihood Masked Language Model
Paper Authors
Paper Abstract
Knowledge Bases (KBs) are easy to query, verifiable, and interpretable. They scale, however, with man-hours and high-quality data. Masked Language Models (MLMs), such as BERT, scale with computing power as well as with unstructured raw text data. The knowledge contained within those models, however, is not directly interpretable. We propose to perform link prediction with MLMs to address both the KBs' scalability issues and the MLMs' interpretability issues. To that end, we introduce MLMLM, Mean Likelihood Masked Language Model, an approach that compares the mean likelihood of generating the different entities to perform link prediction in a tractable manner. We obtain state-of-the-art (SotA) results on the WN18RR dataset and the best non-entity-embedding-based results on the FB15k-237 dataset. We also obtain convincing results on link prediction for previously unseen entities, making MLMLM a suitable approach for introducing new entities to a KB.
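To make the mean-likelihood idea concrete, below is a minimal sketch using an off-the-shelf BERT MLM via the HuggingFace `transformers` library. This is not the authors' MLMLM implementation: the template string, the candidate list, and the `mean_log_likelihood` helper are hypothetical illustrations, and the paper's actual training and decoding procedure differs. The sketch only shows the core scoring principle, ranking candidate entities by the mean log-likelihood the MLM assigns to their tokens when placed in a masked slot, so that longer entities are not penalized by their token count.

```python
# Sketch: rank candidate entities by mean masked-token log-likelihood.
# Assumes an off-the-shelf BERT MLM; not the paper's exact implementation.
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def mean_log_likelihood(template: str, entity: str) -> float:
    """Mean log-likelihood of the entity's tokens in the masked slot.

    `template` must contain the placeholder "[ENT]" where the entity goes.
    """
    entity_ids = tokenizer(entity, add_special_tokens=False)["input_ids"]
    # Replace the placeholder with one [MASK] per entity token.
    masked = template.replace(
        "[ENT]", " ".join([tokenizer.mask_token] * len(entity_ids))
    )
    inputs = tokenizer(masked, return_tensors="pt")
    mask_positions = (
        inputs["input_ids"][0] == tokenizer.mask_token_id
    ).nonzero(as_tuple=True)[0]
    with torch.no_grad():
        logits = model(**inputs).logits[0]  # (seq_len, vocab_size)
    log_probs = torch.log_softmax(logits, dim=-1)
    # Log-probability of each gold entity token at its masked position.
    scores = [
        log_probs[pos, tok].item()
        for pos, tok in zip(mask_positions, entity_ids)
    ]
    # Mean (not sum), so entity length does not dominate the ranking.
    return sum(scores) / len(scores)

# Toy link-prediction query (head, relation, ?) with hypothetical candidates.
template = "A puppy is a young [ENT] ."
candidates = ["dog", "cat", "golden retriever"]
ranked = sorted(
    candidates, key=lambda e: mean_log_likelihood(template, e), reverse=True
)
print(ranked)
```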