Paper Title
A Tree Architecture of LSTM Networks for Sequential Regression with Missing Data
Paper Authors
Paper Abstract
We investigate regression for variable-length sequential data containing missing samples and introduce a novel tree architecture based on Long Short-Term Memory (LSTM) networks. In our architecture, we employ a variable number of LSTM networks, which use only the existing inputs in the sequence, arranged in a tree-like structure, without any statistical assumptions on or imputation of the missing data, unlike all previous approaches. In particular, we incorporate the missingness information by selecting a subset of these LSTM networks based on the "presence pattern" of a certain number of previous inputs. From the mixture-of-experts perspective, we train different LSTM networks as our experts for the various missingness patterns and then combine their outputs to generate the final prediction. We also provide a computational complexity analysis of the proposed architecture, which is of the same order as the complexity of the conventional LSTM architecture with respect to the sequence length. Our method can be readily extended to similar structures such as GRUs and RNNs, as remarked in the paper. In our experiments, we achieve significant performance improvements over state-of-the-art methods on well-known financial and real-life datasets.
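The presence-pattern gating described in the abstract can be illustrated with a short sketch. The snippet below is not the authors' reference implementation; it is a minimal sketch assuming one LSTM expert per binary presence pattern over the last p inputs, where the selected expert is fed only the observed inputs, with no imputation. All names (PatternGatedLSTM, pattern_order, the readout head, and the toy usage at the end) are illustrative assumptions.

```python
# Minimal sketch of presence-pattern expert selection (illustrative, not the
# paper's exact architecture): one LSTM "expert" per missingness pattern over
# the last p inputs; only the observed inputs are fed to the selected expert.
import torch
import torch.nn as nn


class PatternGatedLSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, pattern_order=2):
        super().__init__()
        self.pattern_order = pattern_order      # p: how many past steps define the pattern
        num_patterns = 2 ** pattern_order       # one expert per binary presence pattern
        self.experts = nn.ModuleList(
            [nn.LSTM(input_dim, hidden_dim, batch_first=True) for _ in range(num_patterns)]
        )
        self.readout = nn.Linear(hidden_dim, 1)  # scalar regression head

    def forward(self, x, mask):
        """x: (T, input_dim) sequence, mask: (T,) with 1 = observed, 0 = missing."""
        T = x.shape[0]
        preds = []
        for t in range(self.pattern_order, T):
            # Binary presence pattern of the previous p inputs -> expert index.
            window = mask[t - self.pattern_order:t].long()
            idx = int((window * (2 ** torch.arange(self.pattern_order))).sum())
            # Feed only the observed inputs in the window (no imputation).
            observed = x[t - self.pattern_order:t][window.bool()]
            if observed.numel() == 0:
                continue                         # nothing observed, skip this step
            out, _ = self.experts[idx](observed.unsqueeze(0))
            preds.append(self.readout(out[:, -1]))  # predict from the last hidden state
        return torch.cat(preds) if preds else torch.empty(0)


# Toy usage: a length-6 sequence with two missing samples.
model = PatternGatedLSTM(input_dim=3, hidden_dim=8, pattern_order=2)
x = torch.randn(6, 3)
mask = torch.tensor([1, 0, 1, 1, 0, 1], dtype=torch.float32)
print(model(x, mask).shape)
```

Because each time step activates only the expert matching its presence pattern, the per-step cost stays comparable to running a single LSTM, which is consistent with the abstract's claim that the overall complexity remains of the same order as a conventional LSTM in the sequence length.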