Match$^2$: A Matching over Matching Model for Similar Question Identification

Paper Title

Match$^2$: A Matching over Matching Model for Similar Question Identification

Authors

Zizhen Wang, Yixing Fan, Jiafeng Guo, Liu Yang, Ruqing Zhang, Yanyan Lan, Xueqi Cheng, Hui Jiang, Xiaozhao Wang

Abstract

Community Question Answering (CQA) has become a primary means for people to acquire knowledge, where people are free to ask questions or submit answers. To enhance the efficiency of the service, similar question identification has become a core task in CQA: it aims to find a similar question in the archived repository whenever a new question is asked. However, it has long been a challenge to properly measure the similarity between two questions because of the inherent variation of natural language, i.e., there can be different ways to ask the same question, and different questions can share similar expressions. To alleviate this problem, it is natural to use the existing answers to enrich the archived questions. Traditional methods typically take a one-side usage, which treats the answer as an expanded representation of the corresponding question. Unfortunately, because answers are often long and diverse, this may introduce unexpected noise into the similarity computation, leading to inferior performance. In this work, we propose a two-side usage, which leverages the answer as a bridge between the two questions. The key idea is based on our observation that similar questions could be addressed by similar parts of the answer while different questions may not. In other words, we can compare the matching patterns of the two questions over the same answer to measure their similarity. In this way, we propose a novel matching over matching model, namely Match$^2$, which compares the matching patterns between two question-answer pairs for similar question identification. Empirical experiments on two benchmark datasets demonstrate that our model can significantly outperform previous state-of-the-art methods on the similar question identification task.
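
The two-side idea in the abstract, comparing how two questions each match the same answer, can be illustrated with a toy sketch. This is not the Match$^2$ architecture: the abstract does not specify the matching or comparison functions, so the sketch assumes exact token overlap for the matching pattern and cosine similarity for the comparison. The function names (`matching_pattern`, `pattern_similarity`) and the example texts are hypothetical.

```python
import math


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption made for this toy example)."""
    return text.lower().split()


def matching_pattern(question, answer):
    """Return one value per answer token: 1.0 if the token also occurs
    in the question, else 0.0. This stands in for a learned
    question-answer matching pattern."""
    question_tokens = set(tokenize(question))
    return [1.0 if tok in question_tokens else 0.0 for tok in tokenize(answer)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pattern_similarity(question_a, question_b, answer):
    """Compare the two questions through their matching patterns over the
    same answer, rather than comparing the question texts directly."""
    return cosine(
        matching_pattern(question_a, answer),
        matching_pattern(question_b, answer),
    )


# Hypothetical archived question-answer pair and two incoming questions.
archived_question = "how to change the wifi password"
archived_answer = "restart the router then reset the wifi password in settings"
similar_question = "how do i reset my wifi password"
different_question = "why does my router restart so often"

similar_score = pattern_similarity(similar_question, archived_question, archived_answer)
different_score = pattern_similarity(different_question, archived_question, archived_answer)
print(similar_score > different_score)
```

In this sketch the similar question and the archived question both light up the "wifi password" part of the answer, while the different question only touches "restart" and "router", so the two patterns do not overlap. A real model would replace the exact-overlap vectors with learned soft matching signals.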
