Paper title

A notion of depth for sparse functional data

Paper authors

Carlo Sguera, Sara López-Pintado

Abstract

Data depth is a well-known and useful nonparametric tool for analyzing functional data. It provides a novel way of ranking a sample of curves from the center outwards and of defining robust statistics, such as the median or trimmed means. It has also been used as a building block for functional outlier detection and classification methods. Several notions of depth for functional data have been introduced in the literature over the last few decades. These functional depths can only be applied directly to samples of curves measured on a fine, common grid. In practice, this is not always the case, and curves are often observed on sparse, subject-dependent grids. In these scenarios, the usual approach consists of estimating the trajectories on a common dense grid and using the estimates in the depth analysis. This approach ignores the uncertainty associated with the curve estimation step. Our goal is to extend the notion of depth so that it takes this uncertainty into account. Using both functional estimates and their associated confidence intervals, we propose a new method that allows the curve estimation uncertainty to be incorporated into the depth analysis. We describe the new approach using the modified band depth, although any other functional depth could be used. The performance of the proposed methodology is illustrated using simulated curves in different settings where we control the degree of sparsity. A real data set consisting of egg-laying trajectories of female medflies is also considered. The results show the benefits of accounting for uncertainty when computing depth for sparse functional data.
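
The abstract refers to the modified band depth as the base depth for the proposed method. As a rough illustration only, the sketch below computes the standard modified band depth (bands formed by pairs of curves) for curves that are already observed on a common grid. It is not the authors' uncertainty-aware extension for sparse data, which the abstract does not specify in detail. The function name `modified_band_depth` and the toy curves are assumptions made for this example.

```python
from itertools import combinations

import numpy as np


def modified_band_depth(curves):
    """Standard modified band depth with bands formed by pairs of curves.

    `curves` is an array of shape (n_curves, n_points); every curve is
    assumed to be observed on the same grid (the dense, common-grid case).
    Returns one depth value in [0, 1] per curve.
    """
    curves = np.asarray(curves, dtype=float)
    n_curves = curves.shape[0]
    if n_curves < 2:
        raise ValueError("at least two curves are needed to form a band")

    depth = np.zeros(n_curves)
    # Loop over every band delimited by a pair of sample curves.
    for j, k in combinations(range(n_curves), 2):
        lower = np.minimum(curves[j], curves[k])
        upper = np.maximum(curves[j], curves[k])
        # Proportion of grid points at which each curve lies inside the band.
        inside = (curves >= lower) & (curves <= upper)
        depth += inside.mean(axis=1)

    n_pairs = n_curves * (n_curves - 1) / 2
    return depth / n_pairs


# Toy usage: three flat curves at heights 0, 1 and 2 on a 5-point grid.
toy_curves = np.array([[0.0] * 5, [1.0] * 5, [2.0] * 5])
toy_depth = modified_band_depth(toy_curves)
toy_median_index = int(np.argmax(toy_depth))  # index of the deepest curve
```

In this toy sample the middle curve lies inside every band, so it receives the largest depth and plays the role of the sample median. The pairwise loop is written for readability and scales quadratically with the number of curves; a rank-based formulation would be preferable for large samples.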
