Multi-modal data generation with a deep metric variational autoencoder

Paper title

Multi-modal data generation with a deep metric variational autoencoder

Authors

Sundgaard, Josefine Vilsbøll; Hannemose, Morten Rieger; Laugesen, Søren; Bray, Peter; Harte, James; Kamide, Yosuke; Tanaka, Chiemi; Paulsen, Rasmus R.; Christensen, Anders Nymark

Abstract

We present a deep metric variational autoencoder for multi-modal data generation. The variational autoencoder employs triplet loss in the latent space, which allows for conditional data generation by sampling in the latent space within each class cluster. The approach is evaluated on a multi-modal dataset consisting of otoscopy images of the tympanic membrane with corresponding wideband tympanometry measurements. The modalities in this dataset are correlated, as they represent different aspects of the state of the middle ear, but they do not present a direct pixel-to-pixel correlation. The approach shows promising results for the conditional generation of pairs of images and tympanograms, and will allow for efficient data augmentation of data from multi-modal sources.
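
The two ideas named in the abstract, a triplet loss on latent codes and sampling within a class cluster, can be illustrated with a minimal sketch. The abstract does not give the loss formulation, the distance metric, or the sampling distribution, so the Euclidean distance, the margin value, the diagonal Gaussian per class, and all function and label names below are assumptions for illustration only. The encoder and the decoders that would map a sampled latent vector to an image and a tympanogram are omitted.

```python
import math
import random


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on latent vectors (Euclidean distance assumed)."""
    d_pos = math.dist(anchor, positive)   # distance to a same-class code
    d_neg = math.dist(anchor, negative)   # distance to a different-class code
    return max(0.0, d_pos - d_neg + margin)


def fit_class_clusters(latents, labels):
    """Per-class mean and per-dimension standard deviation of latent codes."""
    clusters = {}
    for label in set(labels):
        members = [z for z, y in zip(latents, labels) if y == label]
        dims = list(zip(*members))        # transpose: one tuple per latent dimension
        mean = [sum(d) / len(d) for d in dims]
        std = [
            math.sqrt(sum((v - m) ** 2 for v in d) / len(d))
            for d, m in zip(dims, mean)
        ]
        clusters[label] = (mean, std)
    return clusters


def sample_latent(clusters, label, rng):
    """Draw one latent vector from the diagonal Gaussian fitted to a class (assumed)."""
    mean, std = clusters[label]
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


# Hypothetical 2-D latent codes for two placeholder classes.
latents = [[0.0, 0.1], [0.2, -0.1], [3.0, 3.1], [2.8, 2.9]]
labels = ["class_a", "class_a", "class_b", "class_b"]

clusters = fit_class_clusters(latents, labels)
z_new = sample_latent(clusters, "class_a", random.Random(0))
```

In a full model, `z_new` would be passed to the decoders to produce a paired image and tympanogram for the chosen class; here it only shows how conditioning reduces to choosing which cluster to sample from.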
