Selectively increasing the diversity of GAN-generated samples

Paper title

Selectively increasing the diversity of GAN-generated samples

Authors

Jan Dubiński, Kamil Deja, Sandro Wenzel, Przemysław Rokita, Tomasz Trzciński

Abstract

Generative Adversarial Networks (GANs) are powerful models able to synthesize data samples closely resembling the distribution of real data, yet the diversity of those generated samples is limited due to the so-called mode collapse phenomenon observed in GANs. Conditional GANs are especially prone to mode collapse, as they tend to ignore the input noise vector and focus on the conditional information. Recent methods proposed to mitigate this limitation increase the diversity of generated samples, yet they reduce the performance of the models when similarity of samples is required. To address this shortcoming, we propose a novel method to selectively increase the diversity of GAN-generated samples. By adding a simple yet effective regularization to the training loss function, we encourage the generator to discover new data modes for inputs related to diverse outputs while generating consistent samples for the remaining ones. More precisely, we maximise the ratio of the distance between generated images to the distance between their input latent vectors, scaling the effect according to the diversity of samples for a given conditional input. We show the superiority of our method in a synthetic benchmark as well as in a real-life scenario of simulating data from the Zero Degree Calorimeter of the ALICE experiment at the LHC, CERN.
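
The regularization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the mean absolute difference used as the distance, the `eps` stabiliser, and the form of `diversity_weight` (a non-negative number assumed to estimate how diverse the real samples are for a given condition) are all assumptions made for illustration. The sketch takes two outputs generated from the same condition with different latent vectors and returns a penalty term that a training loop could add to the generator loss.

```python
def mean_abs_distance(a, b):
    """Mean absolute difference between two equal-length sequences of floats."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def selective_diversity_penalty(out1, out2, z1, z2, diversity_weight, eps=1e-8):
    """Hypothetical regularizer (an assumption, not the paper's exact loss).

    out1, out2: flattened generator outputs for the SAME condition.
    z1, z2: the two latent vectors that produced them.
    diversity_weight: assumed per-condition diversity estimate (>= 0).

    Returns the negative weighted ratio of output distance to latent
    distance, so minimising it maximises the ratio for diverse conditions
    and leaves low-diversity conditions nearly unaffected.
    """
    ratio = mean_abs_distance(out1, out2) / (mean_abs_distance(z1, z2) + eps)
    return -diversity_weight * ratio


# Two toy "images" generated from different latent vectors, same condition.
penalty_diverse = selective_diversity_penalty(
    [0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [2.0, 2.0], diversity_weight=1.0
)
penalty_consistent = selective_diversity_penalty(
    [0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [2.0, 2.0], diversity_weight=0.0
)
```

With a weight of zero the term vanishes, which matches the abstract's goal of keeping samples consistent where the conditional input does not call for diversity; how the weight is actually estimated is left open here.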
