Paper Title

Dynamics of feed forward induced interference training

Authors

Tang, Shirui

Abstract

Updating perceptron models with back propagation has become the routine of deep learning. A continuous feed-forward procedure is required for back propagation to function properly. Doubting the underlying physical interpretation of transformer-based models such as GPT brought about by this routine, a new training method is proposed in order to keep the physics self-consistent. The GPT model is treated as a space-time diagram, and the worldlines of signals are traced to identify the possible paths a signal can take for a self-attention event to occur. With a slight modification, self-attention can be viewed as an Ising-model interaction, which allows the objective to be designed as the energy of the system. The target is treated as an external magnetic field acting on the induced signals, which are modeled as magnetic dipoles. A probability network is designed to pilot input signals travelling for different durations through different routes, and a rule for updating the probabilities is designed to form constructive interference at the target locations so that the instantaneous energy can be maximised. Experiments were conducted on a 4-class classification problem extracted from MNIST. The results exhibit interesting but expected behaviours that do not appear in a back-propagation-updated network and look more like learning in a real human, especially in the few-shot scenario.
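The abstract does not reproduce the energy function itself; a minimal sketch, assuming the standard Ising Hamiltonian and the mapping described above (self-attention supplying the pairwise couplings, the target supplying the external field), would read:

```latex
% Assumed Ising-style energy: s_i are the signal "dipoles",
% J_{ij} the attention-derived couplings, h_i the target field.
E = -\sum_{\langle i,j \rangle} J_{ij}\, s_i s_j \;-\; \sum_i h_i\, s_i
```

Under this conventional sign choice, aligning the dipoles with the field minimises E; the abstract instead speaks of maximising the instantaneous energy, so the paper presumably adopts the opposite sign convention for the same alignment objective.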
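The probability-update rule is likewise not given here. The following Python sketch is only a hypothetical illustration of the mechanism the abstract describes: routes with piloting probabilities, and reinforcement of routes whose signals arrive in phase at the target. The phase model, the multiplicative exponential update, and all names (`interference_score`, `update`, `omega`, `target_phase`) are assumptions, not the paper's method.

```python
import numpy as np

# Hypothetical sketch, not the paper's actual rule. Each route r has a
# piloting probability p[r] and a travel duration; a route's signal
# arrives at the target with a phase set by its duration. Routes whose
# arrivals interfere constructively with the target "field" phase are
# reinforced, then the probabilities are renormalised.
rng = np.random.default_rng(0)
n_routes = 8
p = np.full(n_routes, 1.0 / n_routes)     # piloting probabilities
durations = rng.integers(1, 9, n_routes)  # assumed per-route travel times

def interference_score(durations, target_phase, omega=1.0):
    # cosine of the phase mismatch at the target: +1 fully constructive,
    # -1 fully destructive (assumed phase model)
    return np.cos(omega * durations - target_phase)

def update(p, durations, target_phase, lr=0.1):
    # multiplicative reinforcement of constructive routes (assumed rule)
    p = p * np.exp(lr * interference_score(durations, target_phase))
    return p / p.sum()

for _ in range(100):
    p = update(p, durations, target_phase=2.0)
print(np.round(p, 3))  # mass concentrates on in-phase routes
```

The design intent mirrors the abstract: probability mass migrates toward routes that interfere constructively at the target, which is what would let the instantaneous energy there grow.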
