Paper title
OoDAnalyzer: Interactive Analysis of Out-of-Distribution Samples
Authors
Abstract
One major cause of performance degradation in predictive models is that the test samples are not well covered by the training data. Such poorly represented samples are called out-of-distribution (OoD) samples. In this paper, we propose OoDAnalyzer, a visual analysis approach for interactively identifying OoD samples and explaining them in context. Our approach integrates an ensemble OoD detection method and a grid-based visualization. The detection method improves on deep ensembles by combining more features with algorithms in the same family. To better analyze and understand the OoD samples in context, we have developed a novel kNN-based grid layout algorithm motivated by Hall's theorem. The algorithm approximates the optimal layout in $O(kN^2)$ time, which is faster than the grid layout algorithm with the best overall performance, whose time complexity is $O(N^3)$. Quantitative evaluation and case studies were performed on several datasets to demonstrate the effectiveness and usefulness of OoDAnalyzer.
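To make the idea of a kNN-restricted grid layout concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the samples have already been projected to 2D coordinates in the unit square, restricts each sample to its k nearest grid cells, and looks for a matching in that sparse bipartite graph with augmenting paths. Hall's theorem characterizes when such a restricted graph admits a perfect matching; when it does not, this sketch simply places the leftover samples in the nearest free cell. The function name `knn_grid_layout`, its parameters, and the fallback rule are all hypothetical choices made for illustration.

```python
import math
from typing import List, Optional, Set, Tuple

Point = Tuple[float, float]


def knn_grid_layout(points: List[Point], rows: int, cols: int, k: int) -> List[int]:
    """Assign each 2D point to a distinct grid cell (illustrative sketch).

    Returns, for each point, a cell index equal to row * cols + col.
    """
    n = len(points)
    if n > rows * cols:
        raise ValueError("grid has fewer cells than points")

    # Cell centres in the unit square, in row-major order.
    centres = [((c + 0.5) / cols, (r + 0.5) / rows)
               for r in range(rows) for c in range(cols)]

    def dist(p: Point, q: Point) -> float:
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Sparse bipartite graph: each point keeps only its k nearest cells.
    candidates = [
        sorted(range(len(centres)), key=lambda j: dist(p, centres[j]))[:k]
        for p in points
    ]

    cell_owner: List[Optional[int]] = [None] * len(centres)

    def try_assign(i: int, visited: Set[int]) -> bool:
        # Augmenting-path search: take a free candidate cell, or evict the
        # current owner if that owner can move to another of its candidates.
        for j in candidates[i]:
            if j in visited:
                continue
            visited.add(j)
            owner = cell_owner[j]
            if owner is None or try_assign(owner, visited):
                cell_owner[j] = i
                return True
        return False

    unmatched = [i for i in range(n) if not try_assign(i, set())]

    # Assumed fallback (not from the paper): points that could not be matched
    # within their k candidates go to the nearest cell that is still free.
    for i in unmatched:
        free = [j for j, owner in enumerate(cell_owner) if owner is None]
        j = min(free, key=lambda cell: dist(points[i], centres[cell]))
        cell_owner[j] = i

    assignment = [-1] * n
    for j, owner in enumerate(cell_owner):
        if owner is not None:
            assignment[owner] = j
    return assignment


if __name__ == "__main__":
    corners = [(0.1, 0.1), (0.9, 0.1), (0.1, 0.9), (0.9, 0.9)]
    print(knn_grid_layout(corners, rows=2, cols=2, k=2))
```

In this sketch each augmenting search touches every cell at most once and scans at most k candidates per visited sample, so the matching phase costs on the order of kN work per sample when the grid has about N cells. That is consistent with the $O(kN^2)$ bound quoted in the abstract, although the paper's actual algorithm and its approximation guarantee may differ from what is shown here.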