Paper title
Manifold-adaptive dimension estimation revisited
Authors
Abstract
Data dimensionality informs us about data complexity and sets limits on the structure of successful signal processing pipelines. In this work we revisit and improve the manifold-adaptive Farahmand-Szepesvári-Audibert (FSA) dimension estimator, making it one of the best nearest-neighbor-based dimension estimators available. Assuming that the local manifold density is uniform, we compute the probability density function of local FSA estimates. Based on this probability density function, we propose the median of local estimates as a basic global measure of intrinsic dimensionality, and we demonstrate the advantages of this asymptotically unbiased estimator over the previously proposed statistics, the mode and the mean. Additionally, from the probability density function we derive the maximum likelihood formula for global intrinsic dimensionality under the i.i.d. assumption. We tackle edge and finite-sample effects with an exponential correction formula calibrated on hypercube datasets. We compare the performance of the corrected-median-FSA estimator with that of kNN estimators: maximum likelihood (ML, Levina-Bickel) and two implementations of DANCo (R and MATLAB). We show that the corrected-median-FSA estimator beats the ML estimator and is on an equal footing with DANCo on standard synthetic benchmarks, according to the mean percentage error and error rate metrics. With the median-FSA algorithm, we reveal diverse changes in neural dynamics during resting state and during epileptic seizures. We identify brain areas with lower-dimensional dynamics that are possible causal sources and candidates for being seizure onset zones.
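The median-of-local-estimates idea described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the local FSA estimate has the form ln 2 / ln(r_2k / r_k), where r_k is the distance from a point to its k-th nearest neighbor; that form is recalled from the FSA literature and is not stated in the abstract. The sketch uses brute-force neighbor search and leaves out the exponential edge and finite-sample correction. The names local_fsa and median_fsa, the choice k = 5, and the toy dataset are hypothetical.

```python
import math
import random
import statistics


def local_fsa(points, k=5):
    """Local estimates, assuming the form ln 2 / ln(r_2k / r_k) per point."""
    estimates = []
    for i, p in enumerate(points):
        # Brute-force distances to all other points, sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r_k, r_2k = dists[k - 1], dists[2 * k - 1]
        # Skip degenerate neighborhoods (duplicate points or tied distances).
        if r_2k > r_k > 0:
            estimates.append(math.log(2) / math.log(r_2k / r_k))
    return estimates


def median_fsa(points, k=5):
    """Global estimate: median of the local estimates (no correction applied)."""
    return statistics.median(local_fsa(points, k))


# Toy data (hypothetical): a 2-D plane embedded linearly in a 5-D ambient space.
rng = random.Random(0)
plane = []
for _ in range(400):
    u, v = rng.random(), rng.random()
    plane.append((u, v, u + v, u - v, 2 * u))

estimate = median_fsa(plane, k=5)
print(f"median-FSA estimate: {estimate:.2f}")
```

On this toy dataset the estimate should land near the true intrinsic dimension of 2 rather than the ambient dimension of 5. For larger datasets, the brute-force search would normally be replaced by a k-d tree or another nearest-neighbor index.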