Paper title
Sparse Markov Models for High-dimensional Inference
Authors
Abstract
Finite-order Markov models are theoretically well-studied models for dependent discrete data. Despite their generality, they are rarely applied in empirical work when the order is large. Practitioners avoid higher-order Markov models because (1) the number of parameters grows exponentially with the order and (2) interpretation is often difficult. Mixture transition distribution (MTD) models were introduced to overcome both limitations. An MTD model represents a higher-order Markov model as a convex mixture of single-step Markov chains, which reduces the number of parameters and increases interpretability. Nevertheless, in practice, estimation of MTD models with large orders is still limited by the curse of dimensionality and high algorithmic complexity. Here, we prove that if only a few lags are relevant, we can consistently and efficiently recover those lags and estimate the transition probabilities of high-dimensional MTD models. The key innovation is a recursive procedure for selecting the relevant lags of the model. Our results are based on (1) a new structural result for the MTD and (2) an improved martingale concentration inequality. We illustrate our method using simulations and weather data.
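As a rough illustration of the parameter savings described in the abstract, the sketch below compares the number of free parameters of an unrestricted Markov chain with that of an MTD model, and evaluates an MTD transition distribution as a convex mixture of single-step chains. It assumes one transition matrix per lag and no independent (lag-free) component, which may differ from the exact parameterization used in the paper. All function names and numerical values are hypothetical.

```python
import numpy as np


def full_markov_param_count(num_states: int, order: int) -> int:
    """Free parameters of an unrestricted Markov chain of the given order."""
    # One distribution over num_states symbols for each of num_states**order pasts.
    return num_states ** order * (num_states - 1)


def mtd_param_count(num_states: int, order: int) -> int:
    """Free parameters of an MTD model (assumption: one matrix per lag)."""
    # `order` row-stochastic matrices plus `order - 1` free mixture weights.
    return order * num_states * (num_states - 1) + (order - 1)


def mtd_transition_probs(past, weights, matrices):
    """Next-state distribution under an MTD model.

    past[j] is the state observed at lag j + 1, weights[j] is the mixture
    weight of that lag, and matrices[j] is its single-step transition matrix.
    """
    return sum(w * P[x] for w, P, x in zip(weights, matrices, past))


# Hypothetical two-state example with two lags.
weights = [0.7, 0.3]
matrices = [
    np.array([[0.9, 0.1], [0.2, 0.8]]),  # lag 1
    np.array([[0.5, 0.5], [0.1, 0.9]]),  # lag 2
]
probs = mtd_transition_probs(past=[0, 1], weights=weights, matrices=matrices)

print(full_markov_param_count(2, 10))  # 1024
print(mtd_param_count(2, 10))          # 29
print(probs)
```

For a binary chain of order 10, the unrestricted model has 1024 free parameters against 29 for this MTD parameterization, so the count grows linearly rather than exponentially with the order.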