Paper Title

Anomalous Sound Detection as a Simple Binary Classification Problem with Careful Selection of Proxy Outlier Examples

Authors

Paul Primus, Verena Haunschmid, Patrick Praher, Gerhard Widmer

Abstract

Unsupervised anomalous sound detection is concerned with identifying sounds that deviate from what is defined as 'normal', without explicitly specifying the types of anomalies. A significant obstacle is the diversity and rarity of outliers, which typically prevent us from collecting a representative set of anomalous sounds. As a consequence, most anomaly detection methods use unsupervised rather than supervised machine learning methods. Nevertheless, we will show that anomalous sound detection can be effectively framed as a supervised classification problem if the set of anomalous samples is carefully substituted with what we call proxy outliers. Candidates for proxy outliers are available in abundance, as they potentially include all recordings that are neither normal nor abnormal sounds. We experiment with the machine condition monitoring data set of the DCASE 2020 Challenge and find that proxy outliers with matching recording conditions and high similarity to the target sounds are particularly informative. If no data with similar sounds and matching recording conditions is available, data sets with a larger diversity in these two dimensions are preferable. Our models based on supervised training with proxy outliers achieved rank three in Task 2 of the DCASE 2020 Challenge.
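The framing described in the abstract can be illustrated with a small, hedged sketch. It is not the authors' model or data: the 2-D points are synthetic stand-ins for audio features, the classifier is a plain logistic regression written with the standard library, and all names (`make_cluster`, `expand`, `train_logreg`, `run_demo`) are hypothetical. The sketch trains on normal samples versus proxy outliers only, then uses the predicted "proxy outlier" probability as an anomaly score for samples of a kind never seen during training.

```python
import math
import random


def make_cluster(rng, center, spread, n):
    """Draw n synthetic 2-D feature vectors around a center (stand-in for audio features)."""
    return [[rng.gauss(center[0], spread), rng.gauss(center[1], spread)]
            for _ in range(n)]


def expand(x):
    """Add squared terms so a linear model can draw a closed boundary around the normal cluster."""
    return [x[0], x[1], x[0] * x[0], x[1] * x[1]]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(model, features):
    """Probability that a sample belongs to the proxy-outlier class."""
    w, b = model
    return sigmoid(sum(wj * fj for wj, fj in zip(w, features)) + b)


def train_logreg(X, y, lr=0.05, epochs=500):
    """Full-batch gradient descent on the binary cross-entropy loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict((w, b), xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def run_demo(seed=0):
    rng = random.Random(seed)
    # Class 0: normal sounds of the target machine (assumed cluster).
    normal = make_cluster(rng, (0.0, 0.0), 0.5, 200)
    # Class 1: proxy outliers, i.e. other recordings, deliberately diverse.
    proxies = []
    for center in [(3.0, 0.0), (-3.0, 0.0), (0.0, 3.0), (0.0, -3.0)]:
        proxies += make_cluster(rng, center, 0.5, 50)
    X = [expand(x) for x in normal + proxies]
    y = [0.0] * len(normal) + [1.0] * len(proxies)
    model = train_logreg(X, y)
    # Evaluation: held-out normal samples and anomalies never used in training.
    test_normal = make_cluster(rng, (0.0, 0.0), 0.5, 50)
    test_anomalous = make_cluster(rng, (2.5, 2.5), 0.3, 50)
    mean_normal = sum(predict(model, expand(x)) for x in test_normal) / 50
    mean_anomalous = sum(predict(model, expand(x)) for x in test_anomalous) / 50
    return mean_normal, mean_anomalous


mean_normal, mean_anomalous = run_demo()
print("mean anomaly score, held-out normal:", round(mean_normal, 3))
print("mean anomaly score, unseen anomalies:", round(mean_anomalous, 3))
```

In this toy setting the unseen anomalies receive higher scores than held-out normal samples because the proxy outliers surround the normal cluster. How well that carries over to real recordings depends on how closely the proxy outliers match the target sounds and recording conditions, which is the selection question the abstract raises.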
