Paper Title
Learning to Select from Multiple Options
Paper Authors
Paper Abstract
Many NLP tasks can be regarded as a selection problem from a set of options, such as classification tasks, multi-choice question answering, etc. Textual entailment (TE) has been shown to be the state-of-the-art (SOTA) approach to dealing with those selection problems. TE treats the input text as a premise (P) and each option as a hypothesis (H), then handles the selection problem by modeling each (P, H) pair independently. This pairwise scheme has two limitations: first, it is unaware of the other options, which is counterintuitive since humans often determine the best option by comparing competing candidates; second, the inference process of pairwise TE is time-consuming, especially when the option space is large. To deal with these two issues, this work first proposes a contextualized TE model (Context-TE) that appends k other options as the context of the current (P, H) pair. Context-TE is able to learn a more reliable decision for H since it considers various contexts. Second, we speed up Context-TE with Parallel-TE, which learns the decisions for multiple options simultaneously. Parallel-TE significantly improves inference speed while keeping performance comparable to Context-TE. Our methods are evaluated on three tasks (ultra-fine entity typing, intent detection and multi-choice QA) that are typical selection problems with different option-space sizes. Experiments show that our models set new SOTA performance; in particular, Parallel-TE is k times faster than pairwise TE at inference. Our code is publicly available at https://github.com/jiangshdd/LearningToSelect.
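The abstract describes how pairwise TE scores each option in isolation and how Context-TE appends competing options as context. The sketch below is a rough, hedged illustration of those two input constructions only, not the authors' implementation: the off-the-shelf NLI checkpoint roberta-large-mnli, the " | " separator used to append the k competing options, and the choice of k are stand-in assumptions; the paper's actual templates, trained models, and the Parallel-TE variant are in the linked repository.

```python
# Minimal sketch (assumptions noted above): pairwise TE scoring vs. a
# Context-TE-style input that appends k competing options to the hypothesis.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "roberta-large-mnli"  # stand-in NLI checkpoint, not the paper's model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

ENTAIL_IDX = 2  # roberta-large-mnli labels: 0=contradiction, 1=neutral, 2=entailment


def entailment_prob(premise: str, hypothesis: str) -> float:
    """Probability that the premise entails the hypothesis."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, ENTAIL_IDX].item()


def pairwise_te_scores(premise: str, options: list[str]) -> list[float]:
    """Pairwise TE: score each option independently, unaware of the others."""
    return [entailment_prob(premise, hyp) for hyp in options]


def context_te_scores(premise: str, options: list[str], k: int = 2) -> list[float]:
    """Context-TE-style scoring: append k competing options as extra context.
    The ' | ' separator is an illustrative choice, not the paper's template."""
    scores = []
    for i, hyp in enumerate(options):
        others = [o for j, o in enumerate(options) if j != i][:k]
        hyp_with_context = hyp + " | " + " | ".join(others)
        scores.append(entailment_prob(premise, hyp_with_context))
    return scores


if __name__ == "__main__":
    premise = "The new phone's battery lasts two full days on a single charge."
    options = [
        "The battery life is long.",
        "The phone has a bad camera.",
        "The phone is cheap.",
    ]
    print("pairwise:", pairwise_te_scores(premise, options))
    print("context :", context_te_scores(premise, options))
```

Note that this stand-in still runs one forward pass per option; the speedup claimed for Parallel-TE comes from scoring all options in a single pass, which requires the authors' model and input format rather than a generic NLI classifier.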