Paper title
Matrix Shuffle-Exchange Networks for Hard 2D Tasks
Paper authors
Paper abstract
Convolutional neural networks have become the main tools for processing two-dimensional data. They work well for images, yet convolutions have a limited receptive field, which prevents their application to more complex 2D tasks. We propose a new neural model, called the Matrix Shuffle-Exchange network, that can efficiently exploit long-range dependencies in 2D data and has speed comparable to that of a convolutional neural network. It is derived from the Neural Shuffle-Exchange network and has $\mathcal{O}(\log{n})$ layers and $\mathcal{O}(n^2 \log{n})$ total time and space complexity for processing an $n \times n$ data matrix. We show that the Matrix Shuffle-Exchange network is well-suited for algorithmic and logical reasoning tasks on matrices and dense graphs, exceeding convolutional and graph neural network baselines. Its distinct advantage is the capability of retaining full long-range dependency modelling when generalizing to larger instances, much larger than could be processed with models equipped with a dense attention mechanism.
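To give a rough sense of the scaling gap the abstract refers to, the sketch below compares the stated $\mathcal{O}(n^2 \log{n})$ growth with the $\mathcal{O}(n^4)$ growth of dense attention applied to all $n^2$ cells of the matrix. It is illustrative arithmetic only: the function names are hypothetical, all constant factors are dropped, and the layer count is assumed to be exactly $\lceil \log_2 n \rceil$, which the abstract does not specify.

```python
def shuffle_exchange_cost(n: int) -> int:
    """Illustrative cost n^2 * ceil(log2 n); constant factors are dropped (assumption)."""
    layers = max(1, (n - 1).bit_length())  # ceil(log2 n) for n >= 2, computed with integers
    return n * n * layers


def dense_attention_cost(n: int) -> int:
    """Pairwise interactions among all n^2 matrix cells: (n^2)^2 = n^4."""
    return (n * n) ** 2


if __name__ == "__main__":
    for n in (32, 256, 1024):
        se = shuffle_exchange_cost(n)
        att = dense_attention_cost(n)
        print(f"n={n:5d}  n^2 log n ~ {se:,}  n^4 ~ {att:,}  ratio ~ {att // se:,}")
```

Under these assumptions the ratio between the two grows roughly as $n^2 / \log{n}$, which is consistent with the claim that larger instances remain tractable where dense attention does not.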