Paper title
Differential radial basis function network for sequence modelling
Paper authors
Abstract
We propose a differential radial basis function (RBF) network termed RBF-DiffNet -- whose hidden layer blocks are partial differential equations (PDEs) linear in terms of the RBF -- to make the baseline RBF network robust to noise in sequential data. Assuming that the sequential data derives from the discretisation of the solution to an underlying PDE, the differential RBF network learns constant linear coefficients of the PDE, consequently regularising the RBF network by following modified backward-Euler updates. We experimentally validate the differential RBF network on the logistic map chaotic timeseries as well as on 30 real-world timeseries provided by Walmart in the M5 forecasting competition. The proposed model is compared with the normalised and unnormalised RBF networks, ARIMA, and ensembles of multilayer perceptrons (MLPs) and recurrent networks with long short-term memory (LSTM) blocks. From the experimental results, RBF-DiffNet consistently shows a marked reduction over the baseline RBF network in terms of the prediction error (e.g., 26% reduction in the root mean squared scaled error on the M5 dataset); RBF-DiffNet also shows a comparable performance to the LSTM ensemble at less than one-sixteenth the LSTM computational time. Our proposed network consequently enables more accurate predictions -- in the presence of observational noise -- in sequence modelling tasks such as timeseries forecasting that leverage the model interpretability, fast training, and function approximation properties of the RBF network.
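The abstract's core idea is an RBF regressor whose fit is constrained by a backward-Euler discretisation of a linear differential equation with learned constant coefficients. The fragment below is a minimal illustrative sketch of that general pattern only, not the authors' RBF-DiffNet: it fits a plain Gaussian-RBF one-step forecaster on the logistic map (the paper's synthetic benchmark) and adds a backward-Euler consistency penalty for an assumed scalar linear ODE dy/dt = a*y standing in for the PDE. All names, hyperparameters, and the penalty form are assumptions made for illustration.

```python
# Illustrative sketch only -- not the authors' RBF-DiffNet implementation.
# Assumed stand-in: a scalar linear ODE dy/dt = a*y in place of the learned PDE,
# enforced as a backward-Euler consistency penalty on the RBF network output.
import numpy as np

def rbf_features(x, centres, widths):
    """Gaussian RBF activations phi_j(x) = exp(-||x - c_j||^2 / (2 * s_j^2))."""
    # x: (n_samples, n_inputs); centres: (n_hidden, n_inputs); widths: (n_hidden,)
    d2 = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * widths[None, :] ** 2))

def backward_euler_residual(y_prev, y_next, dt, a):
    """Residual of a backward-Euler step for dy/dt = a*y:
    y_next - y_prev - dt * a * y_next (zero when the step is consistent)."""
    return y_next - y_prev - dt * a * y_next

def loss(weights, a, phi, y_prev, y_target, dt, lam=0.1):
    """One-step forecast error plus backward-Euler consistency penalty."""
    y_pred = phi @ weights                            # RBF network output
    fit = np.mean((y_pred - y_target) ** 2)           # data-fitting term
    reg = np.mean(backward_euler_residual(y_prev, y_pred, dt, a) ** 2)
    return fit + lam * reg

# Toy usage on the noisy logistic map x_{t+1} = r*x_t*(1 - x_t), echoing the
# paper's synthetic experiment (window size, noise level, etc. are arbitrary).
rng = np.random.default_rng(0)
r, n = 3.9, 500
x = np.empty(n)
x[0] = 0.5
for t in range(n - 1):
    x[t + 1] = r * x[t] * (1 - x[t])
x_noisy = x + 0.01 * rng.standard_normal(n)

window = 4
X = np.stack([x_noisy[t:t + window] for t in range(n - window - 1)])
y_prev = x_noisy[window - 1:n - 2]      # last observed value in each window
y_target = x_noisy[window:n - 1]        # next value to predict

centres = X[rng.choice(len(X), 20, replace=False)]
widths = np.full(20, 0.2)
phi = rbf_features(X, centres, widths)
weights = np.linalg.lstsq(phi, y_target, rcond=None)[0]   # plain RBF fit
print("loss with backward-Euler penalty:",
      loss(weights, a=-1.0, phi=phi, y_prev=y_prev,
           y_target=y_target, dt=0.1))
```

In RBF-DiffNet itself, per the abstract, the linear PDE coefficients are learned and the PDE structure is embedded in the hidden layer blocks; the external penalty above is only a simplified stand-in for that coupling.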