Paper Title
First De-Trend then Attend: Rethinking Attention for Time-Series Forecasting
Paper Authors
Paper Abstract
Transformer-based models have gained great popularity and demonstrated promising results in long-term time-series forecasting in recent years. In addition to learning attention in the time domain, recent works also explore learning attention in frequency domains (e.g., the Fourier domain, the wavelet domain), given that seasonal patterns can be better captured in these domains. In this work, we seek to understand the relationships among attention models in the time and frequency domains. Theoretically, we show that attention models in different domains are equivalent under linear conditions (i.e., when attention scores are computed with a linear kernel rather than softmax). Empirically, we analyze how attention models in different domains behave differently, through synthetic experiments with varying seasonality, trend, and noise, with emphasis on the role of the softmax operation therein. Both the theoretical and empirical analyses motivate us to propose a new method, TDformer (Trend Decomposition Transformer), which first applies seasonal-trend decomposition and then additively combines an MLP, which predicts the trend component, with Fourier attention, which predicts the seasonal component, to obtain the final forecast. Extensive experiments on benchmark time-series forecasting datasets demonstrate that TDformer achieves state-of-the-art performance against existing attention-based models.
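The linear-equivalence claim in the abstract admits a short justification. The following is a sketch under the assumptions that $F$ is the unitary (orthonormalized) DFT matrix applied along the time axis, that the queries, keys, and values $Q, K, V$ are real-valued, and that attention uses the raw scores $QK^{\top}$ with no softmax. Writing $\tilde{Q} = FQ$, $\tilde{K} = FK$, $\tilde{V} = FV$ for the Fourier-domain counterparts,

$$\tilde{Q}\,\tilde{K}^{\mathsf{H}}\,\tilde{V} \;=\; FQ\,(FK)^{\mathsf{H}}\,FV \;=\; F\,Q K^{\top} F^{\mathsf{H}} F\,V \;=\; F\,\bigl(Q K^{\top} V\bigr),$$

since $F^{\mathsf{H}} F = I$. Inverting the transform recovers exactly the time-domain output $QK^{\top}V$, so linear attention is domain-agnostic; the softmax, which does not commute with $F$, is what makes time- and frequency-domain attention behave differently.

To make the described architecture concrete, here is a minimal sketch in PyTorch of the pipeline the abstract outlines: moving-average seasonal-trend decomposition, an MLP forecasting the trend, attention in the Fourier domain forecasting the seasonal part, and additive recombination. This is an illustration, not the authors' implementation; all names (`SeasonalTrendDecomposition`, `fourier_attention`, `TDformerSketch`) and hyperparameters (e.g., `kernel_size=25`, the softmax over score magnitudes) are assumptions.

```python
import torch
import torch.nn as nn


class SeasonalTrendDecomposition(nn.Module):
    """Split a series into a trend (moving average) and a seasonal residual."""

    def __init__(self, kernel_size: int = 25):
        super().__init__()
        self.kernel_size = kernel_size
        self.avg = nn.AvgPool1d(kernel_size, stride=1)

    def forward(self, x: torch.Tensor):  # x: (batch, length, channels)
        # Replicate the endpoints so the moving average keeps the original length.
        front = x[:, :1, :].repeat(1, (self.kernel_size - 1) // 2, 1)
        back = x[:, -1:, :].repeat(1, self.kernel_size // 2, 1)
        padded = torch.cat([front, x, back], dim=1)
        trend = self.avg(padded.transpose(1, 2)).transpose(1, 2)
        return x - trend, trend  # (seasonal, trend)


def fourier_attention(q, k, v):
    """Self-attention computed on Fourier coefficients rather than time steps."""
    qf = torch.fft.rfft(q, dim=1)  # (batch, freq, channels), complex
    kf = torch.fft.rfft(k, dim=1)
    vf = torch.fft.rfft(v, dim=1)
    scores = torch.einsum("bfc,bgc->bfg", qf, kf.conj())
    # Softmax needs real inputs, so score magnitudes are used here (an assumption).
    weights = torch.softmax(scores.abs(), dim=-1).to(vf.dtype)
    out = torch.einsum("bfg,bgc->bfc", weights, vf)
    return torch.fft.irfft(out, n=q.size(1), dim=1)  # back to the time domain


class TDformerSketch(nn.Module):
    """Decompose, forecast each component, and add the results back together."""

    def __init__(self, seq_len: int, pred_len: int, kernel_size: int = 25):
        super().__init__()
        self.decomp = SeasonalTrendDecomposition(kernel_size)
        # Trend branch: an MLP mapping the input window to the forecast horizon.
        self.trend_mlp = nn.Sequential(
            nn.Linear(seq_len, pred_len), nn.ReLU(), nn.Linear(pred_len, pred_len)
        )
        # Seasonal branch: Fourier attention, then a projection to the horizon.
        self.seasonal_proj = nn.Linear(seq_len, pred_len)

    def forward(self, x: torch.Tensor):  # x: (batch, seq_len, channels)
        seasonal, trend = self.decomp(x)
        trend_out = self.trend_mlp(trend.transpose(1, 2)).transpose(1, 2)
        seasonal = fourier_attention(seasonal, seasonal, seasonal)
        seasonal_out = self.seasonal_proj(seasonal.transpose(1, 2)).transpose(1, 2)
        return trend_out + seasonal_out  # additive final forecast


# Example: forecast 24 steps from a 96-step window of a 7-channel series.
model = TDformerSketch(seq_len=96, pred_len=24)
y = model(torch.randn(32, 96, 7))  # -> torch.Size([32, 24, 7])
```

The additive two-branch design reflects the motivation above: the trend component is smooth and is handled well by a plain MLP, while the seasonal residual is where frequency-domain attention has an advantage.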