Differentiable Segmentation of Sequences

Paper Title

Differentiable Segmentation of Sequences

Authors

Erik Scharwächter, Jonathan Lennartz, Emmanuel Müller

Abstract

Segmented models are widely used to describe non-stationary sequential data with discrete change points. Their estimation usually requires solving a mixed discrete-continuous optimization problem, where the segmentation is the discrete part and all other model parameters are continuous. A number of estimation algorithms have been developed that are highly specialized for their specific model assumptions. The dependence on non-standard algorithms makes it hard to integrate segmented models in state-of-the-art deep learning architectures that critically depend on gradient-based optimization techniques. In this work, we formulate a relaxed variant of segmented models that enables joint estimation of all model parameters, including the segmentation, with gradient descent. We build on recent advances in learning continuous warping functions and propose a novel family of warping functions based on the two-sided power (TSP) distribution. TSP-based warping functions are differentiable, have simple closed-form expressions, and can represent segmentation functions exactly. Our formulation includes the important class of segmented generalized linear models as a special case, which makes it highly versatile. We use our approach to model the spread of COVID-19 with Poisson regression, apply it on a change point detection task, and learn classification models with concept drift. The experiments show that our approach effectively learns all these tasks with standard algorithms for gradient descent.
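To make the idea of a TSP-based warping function concrete, the sketch below evaluates the standard closed-form CDF of the two-sided power distribution and sums one such CDF per change point to obtain a relaxed segment index. This is only an illustration under stated assumptions: the function names, the symmetric window of width `width` around each change point, and the use of a plain sum are hypothetical choices made here and are not taken from the paper, whose exact parametrization may differ.

```python
import numpy as np

def tsp_cdf(x, a, m, b, n):
    """CDF of the two-sided power distribution on [a, b] with mode m and power n.

    Requires a < m < b and n > 0. Inputs outside [a, b] are clipped, so the
    result is exactly 0 to the left of a and exactly 1 to the right of b.
    """
    x = np.clip(np.asarray(x, dtype=float), a, b)
    left = (m - a) / (b - a) * ((x - a) / (m - a)) ** n
    right = 1.0 - (b - m) / (b - a) * ((b - x) / (b - m)) ** n
    return np.where(x <= m, left, right)

def soft_segment_index(t, change_points, width, n):
    """Relaxed segment index: one TSP CDF per change point, summed.

    Assumption (not from the paper): each change point c gets a symmetric
    window [c - width/2, c + width/2], and windows do not overlap.
    """
    t = np.asarray(t, dtype=float)
    total = np.zeros_like(t)
    for c in change_points:
        total += tsp_cdf(t, c - width / 2, c, c + width / 2, n)
    return total

# Two change points on the unit interval give three segments (indices 0, 1, 2).
t = np.array([0.1, 0.5, 0.9])
segments = soft_segment_index(t, change_points=[0.3, 0.7], width=0.2, n=3.0)
```

Outside the windows the relaxed index is exactly an integer, while inside a window it varies smoothly with the change point location, which is what allows the segmentation to be adjusted by gradient descent. In an actual training setup the same expressions would be written in an automatic differentiation framework rather than plain NumPy.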
