Paper Title

Self-Supervised Losses for One-Class Textual Anomaly Detection

Authors

Mai, Kimberly T., Davies, Toby, Griffin, Lewis D.

Abstract

Current deep learning methods for anomaly detection in text rely on supervisory signals in inliers that may be unobtainable or bespoke architectures that are difficult to tune. We study a simpler alternative: fine-tuning Transformers on the inlier data with self-supervised objectives and using the losses as an anomaly score. Overall, the self-supervision approach outperforms other methods under various anomaly detection scenarios, improving the AUROC score on semantic anomalies by 11.6% and on syntactic anomalies by 22.8% on average. Additionally, the optimal objective and resultant learnt representation depend on the type of downstream anomaly. The separability of anomalies and inliers signals that a representation is more effective for detecting semantic anomalies, whilst the presence of narrow feature directions signals a representation that is effective for detecting syntactic anomalies.
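The scoring scheme the abstract describes — fine-tune a model on inlier text with a self-supervised objective, then treat its per-document loss as the anomaly score and evaluate with AUROC — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the loss values below are hypothetical stand-ins for a fine-tuned Transformer's self-supervised losses.

```python
from typing import List

def auroc(scores: List[float], labels: List[int]) -> float:
    """Area under the ROC curve: the probability that a randomly chosen
    anomaly (label 1) receives a higher score than a randomly chosen
    inlier (label 0). Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical self-supervised losses: a model fine-tuned only on inlier
# text should assign higher loss (= higher anomaly score) to anomalies.
inlier_losses = [0.8, 1.1, 0.9, 1.0]   # in-distribution documents
anomaly_losses = [2.3, 1.9, 2.7]       # semantic/syntactic anomalies

scores = inlier_losses + anomaly_losses
labels = [0] * len(inlier_losses) + [1] * len(anomaly_losses)
print(auroc(scores, labels))  # 1.0 here: anomalies are fully separable
```

In the toy data every anomaly's loss exceeds every inlier's, so the AUROC is 1.0; the paper's reported 11.6% and 22.8% average improvements are gains in exactly this metric.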
