Paper Title
Hopfield Networks is All You Need
Paper Authors
Paper Abstract
We introduce a modern Hopfield network with continuous states and a corresponding update rule. The new Hopfield network can store exponentially (with the dimension of the associative space) many patterns, retrieves the pattern with one update, and has exponentially small retrieval errors. It has three types of energy minima (fixed points of the update): (1) global fixed point averaging over all patterns, (2) metastable states averaging over a subset of patterns, and (3) fixed points which store a single pattern. The new update rule is equivalent to the attention mechanism used in transformers. This equivalence enables a characterization of the heads of transformer models. These heads perform in the first layers preferably global averaging and in higher layers partial averaging via metastable states. The new modern Hopfield network can be integrated into deep learning architectures as layers to allow the storage of and access to raw input data, intermediate results, or learned prototypes. These Hopfield layers enable new ways of deep learning, beyond fully-connected, convolutional, or recurrent networks, and provide pooling, memory, association, and attention mechanisms. We demonstrate the broad applicability of the Hopfield layers across various domains. Hopfield layers improved state-of-the-art on three out of four considered multiple instance learning problems as well as on immune repertoire classification with several hundreds of thousands of instances. On the UCI benchmark collections of small classification tasks, where deep learning methods typically struggle, Hopfield layers yielded a new state-of-the-art when compared to different machine learning methods. Finally, Hopfield layers achieved state-of-the-art on two drug design datasets. The implementation is available at: https://github.com/ml-jku/hopfield-layers
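The retrieval and attention claims in the abstract can be made concrete. For stored patterns collected as the columns of a matrix X and a state (query) vector ξ, the paper's update rule is ξ_new = X softmax(β Xᵀ ξ); with the state and the stored patterns mapped to queries, keys, and values, and β = 1/√d_k, this takes the softmax(QKᵀ/√d_k)V form of transformer attention. Below is a minimal NumPy sketch of that single-step retrieval, not the authors' implementation (the linked repository provides ready-made Hopfield layers); the dimensions, the value of beta, and the helper names are illustrative assumptions.

import numpy as np

def softmax(z):
    # numerically stable softmax over a 1-D array
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def hopfield_update(xi, X, beta=1.0):
    # One step of the continuous Hopfield update described in the abstract:
    #   xi_new = X @ softmax(beta * X.T @ xi)
    # X    : (d, N) matrix whose columns are the N stored patterns
    # xi   : (d,)   current state (the "query")
    # beta : inverse temperature; larger beta concentrates the update
    #        on the stored pattern most similar to xi
    return X @ softmax(beta * (X.T @ xi))

# Toy retrieval: one update recovers a stored pattern from a noisy query.
rng = np.random.default_rng(0)
d, N = 64, 16                                   # dimension, number of patterns
X = rng.standard_normal((d, N))                 # stored patterns as columns
query = X[:, 3] + 0.3 * rng.standard_normal(d)  # corrupted copy of pattern 3
retrieved = hopfield_update(query, X, beta=8.0)
print(int(np.argmax(X.T @ retrieved)))          # prints 3: pattern 3 retrieved

Smaller values of beta flatten the softmax, so the update averages over many patterns (the global and metastable fixed points mentioned in the abstract); larger values yield retrieval of a single stored pattern.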