Paper Title
Metric Statistics: Exploration and Inference for Random Objects With Distance Profiles
Paper Authors
Abstract
This article provides an overview of the statistical modeling of complex data as increasingly encountered in modern data analysis. It is argued that such data can often be described as elements of a metric space that satisfies certain structural conditions and features a probability measure. We refer to the random elements of such spaces as random objects and to the emerging field that deals with their statistical analysis as metric statistics. Metric statistics provides methodology, theory and visualization tools for the statistical description, quantification of variation, centrality and quantiles, regression and inference for populations of random objects, inferring these quantities from available data and samples. In addition to a brief review of current concepts, we focus on distance profiles as a major tool for object data in conjunction with the pairwise Wasserstein transports of the underlying one-dimensional distance distributions. These pairwise transports lead to the definition of intuitive and interpretable notions of transport ranks and transport quantiles as well as two-sample inference. An associated profile metric complements the original metric of the object space and may reveal important features of the object data in data analysis. We demonstrate these tools for the analysis of complex data through various examples and visualizations.
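As a rough illustration of the distance-profile idea, the sketch below treats each object's profile as the empirical distribution of its distances to the other sample objects, then compares profiles with the one-dimensional Wasserstein distance. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy data, the choice p = 2, and in particular the centrality score (a logistic transform of the average signed quantile difference) are assumptions made here, since the abstract does not give the formal definitions of transport ranks or the profile metric.

```python
import numpy as np

def distance_profiles(D):
    """Row i holds the sorted distances from object i to all other objects.

    Sorting yields the empirical quantile function of each distance
    profile on a common grid of n - 1 points.
    """
    n = D.shape[0]
    off_diag = ~np.eye(n, dtype=bool)  # drop the zero self-distances
    return np.sort(D[off_diag].reshape(n, n - 1), axis=1)

def profile_wasserstein(Q, p=2):
    """Pairwise p-Wasserstein distances between one-dimensional profiles.

    For two empirical distributions with the same number of atoms, the
    p-Wasserstein distance is the L^p distance between sorted samples.
    """
    diff = np.abs(Q[:, None, :] - Q[None, :, :])
    return np.mean(diff ** p, axis=2) ** (1.0 / p)

def centrality_score(Q):
    """Hypothetical centrality score in (0, 1); an assumption of this sketch.

    Averages the signed quantile differences (others minus own) and maps
    the result through the logistic function, so objects whose distances
    to the rest of the sample are small receive higher scores.
    """
    own_mean = Q.mean(axis=1)
    signed = own_mean.mean() - own_mean
    return 1.0 / (1.0 + np.exp(-signed))

# Toy sample: three nearby points and one outlier in the Euclidean plane.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]])
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

Q = distance_profiles(D)
W = profile_wasserstein(Q)
score = centrality_score(Q)
```

Only the pairwise distance matrix `D` enters the computation, so the same sketch applies to any metric space once `D` is available; the Euclidean toy data is used purely for convenience.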