Measuring the Biases and Effectiveness of Content-Style Disentanglement

Paper title

Measuring the Biases and Effectiveness of Content-Style Disentanglement

Authors

Xiao Liu, Spyridon Thermos, Gabriele Valvano, Agisilaos Chartsias, Alison O'Neil, Sotirios A. Tsaftaris

Abstract

A recent spate of state-of-the-art semi-supervised and unsupervised solutions disentangle and encode image "content" into a spatial tensor and image appearance or "style" into a vector, to achieve good performance in spatially equivariant tasks (e.g. image-to-image translation). To achieve this, they employ different model design, learning objective, and data biases. While considerable effort has been made to measure disentanglement in vector representations and to assess its impact on task performance, such analysis for (spatial) content-style disentanglement is lacking. In this paper, we conduct an empirical study to investigate the role of different biases in content-style disentanglement settings and unveil the relationship between the degree of disentanglement and task performance. In particular, we consider the setting where we: (i) identify key design choices and learning constraints for three popular content-style disentanglement models; (ii) relax or remove such constraints in an ablation fashion; and (iii) use two metrics to measure the degree of disentanglement and assess its effect on the performance of each task. Our experiments reveal that there is a "sweet spot" between disentanglement, task performance and, surprisingly, content interpretability, suggesting that blindly forcing higher disentanglement can hurt model performance and the semantics of the content factors. Our findings, as well as the task-independent metrics used, can guide the design and selection of new models for tasks where content-style representations are useful.
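
The abstract does not name the two metrics it uses. As an illustration only, the sketch below computes the empirical distance correlation between a batch of spatial content tensors and the corresponding style vectors, which is one task-independent way to quantify statistical dependence between the two representations. The function names, tensor shapes, and the choice of this particular measure are assumptions made for the example, not details taken from the abstract.

```python
import numpy as np


def _centered_distances(z: np.ndarray) -> np.ndarray:
    """Pairwise Euclidean distances between samples, double-centered."""
    z = z.reshape(z.shape[0], -1).astype(np.float64)  # flatten each sample
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(content: np.ndarray, style: np.ndarray) -> float:
    """Empirical distance correlation in [0, 1]; the first axis indexes samples.

    Hypothetical helper: `content` might be (N, C, H, W) and `style` (N, D).
    """
    a = _centered_distances(content)
    b = _centered_distances(style)
    dcov2 = max((a * b).mean(), 0.0)  # squared distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())  # product of distance std devs
    return float(np.sqrt(dcov2 / denom)) if denom > 0 else 0.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    content = rng.normal(size=(32, 4, 8, 8))  # assumed content tensor shape
    style = rng.normal(size=(32, 8))          # assumed style vector size
    print(distance_correlation(content, style))
```

Under this measure, values near 0 suggest that content and style carry little shared information, while values near 1 suggest strong dependence. Forming the full pairwise distance matrix costs memory quadratic in the number of samples, so a sketch like this is only practical on modest batch sizes.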
