Paper Title
Semi-supervised domain adaptation with CycleGAN guided by a downstream task loss
Authors
Abstract
Domain adaptation is of great interest because labeling is an expensive and error-prone task, especially when labels are needed at the pixel level, as in semantic segmentation. One would therefore like to train neural networks on synthetic domains, where data is abundant and labels are precise. However, such models often perform poorly on out-of-domain images. To mitigate the shift in the input, image-to-image approaches can be used. Nevertheless, standard image-to-image approaches that bridge the domain of deployment with the synthetic training domain do not focus on the downstream task but only on visual inspection. We therefore propose a "task-aware" version of a GAN in an image-to-image domain adaptation approach. With the help of a small amount of labeled ground-truth data, we guide the image-to-image translation toward input images that are more suitable for a semantic segmentation network trained on synthetic data (the synthetic-domain expert). The main contributions of this work are 1) a modular semi-supervised domain adaptation method for semantic segmentation that trains a downstream-task-aware CycleGAN while refraining from adapting the synthetic semantic segmentation expert, 2) a demonstration that the method is applicable to complex domain adaptation tasks, and 3) a less biased domain gap analysis using networks trained from scratch. We evaluate our method on a classification task as well as on semantic segmentation. Our experiments show that our method outperforms CycleGAN, a standard image-to-image approach, by 7 percentage points in accuracy on a classification task using only 70 (10%) ground-truth images. For semantic segmentation, we achieve an improvement of about 4 to 7 percentage points in mean Intersection over Union (mIoU) on the Cityscapes evaluation dataset, using only 14 ground-truth images during training.
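To make the guidance mechanism concrete, here is a minimal sketch of one generator update in a task-aware CycleGAN, assuming a PyTorch-style setup. The names (`G_r2s`, `G_s2r`, `D_s`, `seg_expert`) and the loss weights are hypothetical illustrations, not the paper's exact implementation. The key idea from the abstract is that the synthetic-domain segmentation expert stays frozen, and its cross-entropy loss on translated images is added to the usual adversarial and cycle-consistency terms whenever a labeled sample is available.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch of one generator update in a task-aware CycleGAN.
# G_r2s / G_s2r: generators (real -> synthetic, synthetic -> real).
# D_s:           discriminator on the synthetic domain.
# seg_expert:    segmentation net pre-trained on synthetic data; its weights
#                are frozen, so gradients flow through it into the generator only.
# lambda_cyc / lambda_task are illustrative weights, not values from the paper.

def generator_step(G_r2s, G_s2r, D_s, seg_expert, real_img, seg_label=None,
                   lambda_cyc=10.0, lambda_task=1.0):
    fake_syn = G_r2s(real_img)          # translate real image to synthetic style
    rec_real = G_s2r(fake_syn)          # cycle back to the real domain

    # Standard CycleGAN terms: least-squares adversarial + cycle consistency.
    d_out = D_s(fake_syn)
    adv_loss = F.mse_loss(d_out, torch.ones_like(d_out))
    cyc_loss = F.l1_loss(rec_real, real_img)
    loss = adv_loss + lambda_cyc * cyc_loss

    # Downstream task guidance: on the few labeled images, the frozen
    # synthetic-domain expert must segment the translated image correctly.
    if seg_label is not None:
        logits = seg_expert(fake_syn)   # (N, C, H, W) class scores
        loss = loss + lambda_task * F.cross_entropy(logits, seg_label)
    return loss

# Freezing the expert is done once, outside the training loop:
# for p in seg_expert.parameters():
#     p.requires_grad_(False)
```

Keeping the expert frozen means the domain gap is closed entirely in image space rather than in the task network, which is the modularity the abstract refers to: the same synthetic-domain expert can be reused unchanged while only the translation network adapts.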