Paper Title

Conditional Generation of Medical Images via Disentangled Adversarial Inference

Authors

Mohammad Havaei, Ximeng Mao, Yiping Wang, Qicheng Lao

Abstract

Synthetic medical image generation has huge potential for improving healthcare through many applications, from data augmentation for training machine learning systems to preserving patient privacy. Conditional Generative Adversarial Networks (cGANs) use a conditioning factor to generate images and have shown great success in recent years. Intuitively, the information in an image can be divided into two parts: 1) content, which is presented through the conditioning vector, and 2) style, which is the undiscovered information missing from the conditioning vector. Current practices in using cGANs for medical image generation use only a single variable for image generation (i.e., content) and therefore do not provide much flexibility or control over the generated image. In this work, we propose a methodology to learn disentangled representations of style and content from the image itself, and to use this information to impose control over the generation process. In this framework, style is learned in a fully unsupervised manner, while content is learned through both supervised learning (using the conditioning vector) and unsupervised learning (with the inference mechanism). We apply two novel regularization steps to ensure content-style disentanglement. First, we minimize the shared information between content and style by introducing a novel application of the gradient reversal layer (GRL); second, we introduce a self-supervised regularization method to further separate the information in the content and style variables. We show that, in general, two-latent-variable models achieve better performance and give more control over the generated image. We also show that our proposed model (DRAI) achieves the best disentanglement score and has the best overall performance.
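To make the GRL-based regularization concrete, below is a minimal PyTorch sketch of the general idea: an auxiliary predictor tries to recover the style code from the content code, and a gradient reversal layer flips the gradient so that the content encoder is pushed to discard style information. All module names and architectures here (ContentEncoder size, style_predictor, disentanglement_loss) are illustrative assumptions, not the authors' DRAI implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -lambda in the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) the gradient flowing back into the content encoder.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

# Placeholder networks; sizes and architectures are assumptions for illustration only.
content_encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 128))
style_predictor = nn.Linear(128, 16)  # tries to recover the 16-dim style code from content

def disentanglement_loss(image, style_code, lambd=1.0):
    """Adversarial objective: style_predictor learns to recover style from the content code,
    while the reversed gradient pushes content_encoder to remove style information."""
    content = content_encoder(image)
    style_from_content = style_predictor(grad_reverse(content, lambd))
    return F.mse_loss(style_from_content, style_code)
```

In a full training loop this term would be added to the usual cGAN objectives, so the content code stays predictive of the conditioning vector while becoming uninformative about style.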
