论文标题

将数学问题转变为游戏:加强学习和Gröbner基础共同解决了整数可行性问题

Turning Mathematics Problems into Games: Reinforcement Learning and Gröbner bases together solve Integer Feasibility Problems

论文作者

Wu, Yue, De Loera, Jesús A.

论文摘要

可以通过玩游戏来训练代理商来回答困难的数学问题吗?我们考虑了整数可行性问题,这是决定线性方程和不平等系统是否具有具有整数值的解决方案的挑战。对于许多数学和计算机科学领域的应用,这是一个著名的NP完整问题。我们的论文描述了一个新颖的代数增强学习框架,该框架使代理商可以玩相当于整数可行性问题的游戏。我们解释了如何在一组固定保证金总和的阵列上将整数可行性问题转换为游戏。游戏从初始状态(数组)开始,并采取法律举动使利润率不变,我们的目标是最终以特定位置与零以的零状态达到胜利状态。要赢得比赛,玩家必须在初始状态和最终终端获胜状态之间找到一条路径。找到这样的获胜状态等同于解决整数可行性问题。关键代数成分是基础轴向转运多面体的曲线的基础。 Gröbner基础可以看作是游戏的一组连接动作(动作)。然后,我们提出了一种新型的RL方法,该方法训练一种代理,以预测连续空间中的移动,以应对较大的动作空间。然后将连续的移动投射到一组法律移动上,以使该路径始终导致有效状态。作为概念证明,我们在实验中证明了我们的代理商可以很好地发挥我们最简单的游戏版本,用于2向表。我们的工作突出了训练代理商通过当代机器学习方法来训练代理玩游戏的潜力来解决非平凡的数学查询的潜力。

Can agents be trained to answer difficult mathematical questions by playing a game? We consider the integer feasibility problem, a challenge of deciding whether a system of linear equations and inequalities has a solution with integer values. This is a famous NP-complete problem with applications in many areas of Mathematics and Computer Science. Our paper describes a novel algebraic reinforcement learning framework that allows an agent to play a game equivalent to the integer feasibility problem. We explain how to transform the integer feasibility problem into a game over a set of arrays with fixed margin sums. The game starts with an initial state (an array), and by applying a legal move that leaves the margins unchanged, we aim to eventually reach a winning state with zeros in specific positions. To win the game the player must find a path between the initial state and a final terminal winning state if one exists. Finding such a winning state is equivalent to solving the integer feasibility problem. The key algebraic ingredient is a Gröbner basis of the toric ideal for the underlying axial transportation polyhedron. The Gröbner basis can be seen as a set of connecting moves (actions) of the game. We then propose a novel RL approach that trains an agent to predict moves in continuous space to cope with the large size of action space. The continuous move is then projected onto the set of legal moves so that the path always leads to valid states. As a proof of concept we demonstrate in experiments that our agent can play well the simplest version of our game for 2-way tables. Our work highlights the potential to train agents to solve non-trivial mathematical queries through contemporary machine learning methods used to train agents to play games.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源