Paper title
Unsupervised Discovery, Control, and Disentanglement of Semantic Attributes with Applications to Anomaly Detection
Paper authors
Paper abstract
Our work focuses on unsupervised and generative methods that address the following goals: (a) learning unsupervised generative representations that discover latent factors controlling image semantic attributes, (b) studying how this ability to control attributes formally relates to the issue of latent factor disentanglement, clarifying related but dissimilar concepts that had been confounded in the past, and (c) developing anomaly detection methods that leverage representations learned in (a). For (a), we propose a network architecture that exploits the combination of multiscale generative models with mutual information (MI) maximization. For (b), we derive an analytical result (Lemma 1) that brings clarity to two related but distinct concepts: the ability of generative networks to control semantic attributes of the images they generate, which results from MI maximization, and the ability to disentangle latent space representations, which is obtained via total correlation minimization. More specifically, we demonstrate that maximizing semantic attribute control encourages disentanglement of latent factors. Using Lemma 1 and adopting MI in our loss function, we then show empirically that, for image generation tasks, the proposed approach exhibits superior performance in the quality and disentanglement trade space when compared to other state-of-the-art methods, with quality assessed via the Fréchet Inception Distance (FID) and disentanglement via the mutual information gap. For (c), we design several systems for anomaly detection exploiting representations learned in (a), and demonstrate their performance benefits when compared to state-of-the-art generative and discriminative algorithms. The above contributions in representation learning have potential applications in addressing other important problems in computer vision, such as bias and privacy in AI.
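The abstract names the mutual information gap as its disentanglement measure but does not say how it is estimated. The sketch below is an illustration only, not the authors' implementation: it assumes the commonly used definition (for each ground-truth factor, the gap between the two latent dimensions that share the most information with it, normalized by the factor's entropy and averaged over factors), and it assumes that latent codes have already been discretized and that ground-truth factors are available. The function names `entropy`, `mutual_information`, and `mutual_information_gap` are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in nats) of a sequence of discrete symbols."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) = H(X) + H(Y) - H(X, Y) for discrete samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def mutual_information_gap(latents, factors):
    """Average normalized gap between the two most informative latents per factor.

    latents: list of per-dimension code sequences (assumed already discretized).
    factors: list of per-factor ground-truth value sequences (assumed known).
    Requires at least two latent dimensions and non-constant factors.
    """
    gaps = []
    for v in factors:
        # Rank latent dimensions by how much information each carries about v.
        mi = sorted((mutual_information(z, v) for z in latents), reverse=True)
        gaps.append((mi[0] - mi[1]) / entropy(v))
    return sum(gaps) / len(gaps)


# Toy data: two independent binary factors observed over four samples.
factors = [[0, 0, 1, 1], [0, 1, 0, 1]]

# Each latent copies exactly one factor (fully disentangled codes).
disentangled = [[0, 0, 1, 1], [0, 1, 0, 1]]

# Both latents encode the pair of factors jointly (fully entangled codes).
entangled = [[0, 1, 2, 3], [0, 1, 2, 3]]

mig_disentangled = mutual_information_gap(disentangled, factors)
mig_entangled = mutual_information_gap(entangled, factors)
```

On this toy data the disentangled codes score 1 up to floating-point rounding, because each factor is captured by exactly one latent, while the entangled codes score 0, because two latents are equally informative about every factor. With continuous latents the score depends on how the codes are binned, which is an estimator choice the abstract leaves open.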