Paper Title

A Formally Robust Time Series Distance Metric

Authors

Maximilian Toller, Bernhard C. Geiger, Roman Kern

Abstract

Distance-based classification is among the most competitive classification methods for time series data. The most critical component of distance-based classification is the selected distance function. Past research has proposed various different distance metrics or measures dedicated to particular aspects of real-world time series data, yet there is an important aspect that has not been considered so far: Robustness against arbitrary data contamination. In this work, we propose a novel distance metric that is robust against arbitrarily "bad" contamination and has a worst-case computational complexity of $\mathcal{O}(n\log n)$. We formally argue why our proposed metric is robust, and demonstrate in an empirical evaluation that the metric yields competitive classification accuracy when applied in k-Nearest Neighbor time series classification.
