Paper Title
Neural Learning of One-of-Many Solutions for Combinatorial Problems in Structured Output Spaces
Paper Authors
Paper Abstract
Recent research has proposed neural architectures for solving combinatorial problems in structured output spaces. In many such problems, there may exist multiple solutions for a given input, e.g., a partially filled Sudoku puzzle may have many completions satisfying all constraints. Further, we are often interested in finding any one of the possible solutions, without any preference between them. Existing approaches completely ignore this solution multiplicity. In this paper, we argue that being oblivious to the presence of multiple solutions can severely hamper their training ability. Our contribution is twofold. First, we formally define the task of learning one-of-many solutions for combinatorial problems in structured output spaces, which is applicable to several problems of interest such as N-Queens and Sudoku. Second, we present a generic learning framework that adapts an existing prediction network for a combinatorial problem to handle solution multiplicity. Our framework uses a selection module, whose goal is to dynamically determine, for every input, the solution that is most effective for training the network parameters in any given learning iteration. We propose an RL-based approach to jointly train the selection module with the prediction network. Experiments on three different domains, using two different prediction networks, demonstrate that our framework significantly improves accuracy in our setting, obtaining up to a 21-point gain over the baselines.
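To make the training scheme described in the abstract concrete, below is a minimal, hypothetical sketch of the idea: a selection module defines a policy over the known valid solutions for each input, samples one to serve as the supervision target for the prediction network, and is itself updated with REINFORCE using the negative supervised loss as reward. The architectures, class names (PredictionNet, SelectionModule), shapes, and reward definition are illustrative assumptions and not details taken from the paper.

```python
# Sketch only: joint training of a prediction network and a solution-selection
# policy when each input admits multiple valid target solutions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PredictionNet(nn.Module):
    """Toy stand-in for a combinatorial prediction network (e.g., Sudoku/N-Queens solver)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, out_dim))

    def forward(self, x):
        return self.net(x)  # logits over output variables

class SelectionModule(nn.Module):
    """Scores each candidate solution for an input, defining a categorical policy."""
    def __init__(self, in_dim, sol_dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(in_dim + sol_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x, candidates):
        # x: (in_dim,), candidates: (k, sol_dim) -> logits of shape (k,)
        joint = torch.cat([x.expand(candidates.size(0), -1), candidates], dim=-1)
        return self.score(joint).squeeze(-1)

def train_step(pred_net, sel_net, opt, x, candidate_solutions):
    """One joint update: sample a target among the valid solutions, supervise the
    prediction net on it, and reward the selection policy via REINFORCE."""
    logits = sel_net(x, candidate_solutions)
    policy = torch.distributions.Categorical(logits=logits)
    idx = policy.sample()
    target = candidate_solutions[idx]

    pred = pred_net(x)
    supervised_loss = F.binary_cross_entropy_with_logits(pred, target)

    reward = -supervised_loss.detach()            # easier-to-fit solution -> higher reward
    policy_loss = -policy.log_prob(idx) * reward  # REINFORCE estimator for the selector

    opt.zero_grad()
    (supervised_loss + policy_loss).backward()
    opt.step()
    return supervised_loss.item()

# Dummy usage: a 16-dim input with 3 valid binary solutions of the same size.
pred_net, sel_net = PredictionNet(16, 16), SelectionModule(16, 16)
opt = torch.optim.Adam(list(pred_net.parameters()) + list(sel_net.parameters()), lr=1e-3)
x = torch.rand(16)
candidates = (torch.rand(3, 16) > 0.5).float()
print(train_step(pred_net, sel_net, opt, x, candidates))
```

The key design point this sketch illustrates is that the supervision target is chosen per input and per iteration, rather than being fixed to one arbitrary ground-truth solution; how the reward and selector are actually defined in the paper may differ from this assumed setup.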