Paper Title

Context-Aware Drift Detection

Authors

Oliver Cobb, Arnaud Van Looveren

Abstract

When monitoring machine learning systems, two-sample tests of homogeneity form the foundation upon which existing approaches to drift detection build. They are used to test for evidence that the distribution underlying recent deployment data differs from that underlying the historical reference data. Often, however, various factors such as time-induced correlation mean that batches of recent deployment data are not expected to form an i.i.d. sample from the historical data distribution. Instead, we may wish to test for differences in the distributions conditional on context that is permitted to change. To facilitate this, we borrow machinery from the causal inference domain to develop a more general drift detection framework built upon a foundation of two-sample tests for conditional distributional treatment effects. We recommend a particular instantiation of the framework based on maximum conditional mean discrepancies. We then provide an empirical study demonstrating its effectiveness for various drift detection problems of practical interest, such as detecting drift in the distributions underlying subpopulations of data in a manner that is insensitive to their respective prevalences. The study additionally demonstrates applicability to ImageNet-scale vision problems.
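
For orientation, the conventional two-sample test of homogeneity that the abstract takes as its starting point can be sketched as a kernel maximum mean discrepancy (MMD) permutation test. This is a minimal illustration of that unconditional baseline only, not the authors' context-aware method. The function names, the Gaussian RBF kernel, the median-heuristic bandwidth, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def pairwise_sq_dists(z):
    """Squared Euclidean distances between all rows of z."""
    sq_norms = np.sum(z**2, axis=1)
    sq = sq_norms[:, None] + sq_norms[None, :] - 2.0 * z @ z.T
    return np.maximum(sq, 0.0)  # clip tiny negatives from rounding


def mmd2_from_kernel(k, idx_x, idx_y):
    """Biased (V-statistic) estimate of squared MMD from a pooled kernel matrix."""
    k_xx = k[np.ix_(idx_x, idx_x)].mean()
    k_yy = k[np.ix_(idx_y, idx_y)].mean()
    k_xy = k[np.ix_(idx_x, idx_y)].mean()
    return k_xx + k_yy - 2.0 * k_xy


def mmd_permutation_test(x_ref, x_new, n_permutations=500, seed=0):
    """Return (observed MMD^2, permutation p-value) for H0: same distribution."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x_ref, x_new])
    n_ref, n_total = len(x_ref), len(pooled)

    # Assumed default: RBF kernel with the median-heuristic bandwidth.
    sq = pairwise_sq_dists(pooled)
    bandwidth = np.sqrt(np.median(sq[np.triu_indices(n_total, k=1)]))
    k = np.exp(-sq / (2.0 * bandwidth**2))

    idx = np.arange(n_total)
    observed = mmd2_from_kernel(k, idx[:n_ref], idx[n_ref:])

    # Under H0 the group labels are exchangeable, so shuffle them repeatedly.
    exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(n_total)
        stat = mmd2_from_kernel(k, perm[:n_ref], perm[n_ref:])
        exceed += int(stat >= observed)
    p_value = (exceed + 1) / (n_permutations + 1)
    return float(observed), p_value


# Synthetic illustration: one batch from the reference distribution, one shifted.
data_rng = np.random.default_rng(42)
reference = data_rng.normal(0.0, 1.0, size=(100, 2))
same = data_rng.normal(0.0, 1.0, size=(100, 2))
shifted = data_rng.normal(1.5, 1.0, size=(100, 2))

obs_same, p_same = mmd_permutation_test(reference, same)
obs_shifted, p_shifted = mmd_permutation_test(reference, shifted)
print(f"same distribution:    MMD^2={obs_same:.4f}, p={p_same:.3f}")
print(f"shifted distribution: MMD^2={obs_shifted:.4f}, p={p_shifted:.3f}")
```

A test of this kind flags any difference between the two samples, including differences caused solely by a change in context, such as a shift in the prevalence of subpopulations. That is the limitation the abstract's conditional formulation is meant to address.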
