Paper Title
Robust Disentanglement of a Few Factors at a Time
Paper Authors
Paper Abstract
Disentanglement is at the forefront of unsupervised learning, as disentangled representations of data improve generalization, interpretability, and performance in downstream tasks. Current unsupervised approaches remain inapplicable to real-world datasets, since their performance is highly variable and fails to reach the levels of disentanglement achieved by (semi-)supervised approaches. We introduce population-based training (PBT) for improving consistency in training variational autoencoders (VAEs) and demonstrate the validity of this approach in a supervised setting (PBT-VAE). We then use Unsupervised Disentanglement Ranking (UDR) as an unsupervised heuristic to score models in our PBT-VAE training and show that models trained this way tend to consistently disentangle only a subset of the generative factors. Building on this observation, we introduce the recursive rPU-VAE approach: we train the model until convergence, remove the learned factors from the dataset, and reiterate. In doing so, we can label subsets of the dataset with the learned factors and subsequently use these labels to train one model that fully disentangles the whole dataset. With this approach, we show striking improvements in state-of-the-art unsupervised disentanglement performance and robustness across multiple datasets and metrics.
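
To make the recursive procedure concrete, the following is a minimal Python sketch of the rPU-VAE outer loop as described in the abstract. It is an illustration under assumptions, not the authors' implementation: the helper callables (train_pbt_vae, extract_factors, remove_and_label, train_supervised) are hypothetical placeholders standing in for UDR-scored PBT training, identification of the consistently disentangled factors, factor removal and labeling, and the final supervised training pass.

from typing import Callable, Dict

def rpu_vae_loop(
    dataset,
    train_pbt_vae: Callable,    # hypothetical: UDR-scored PBT training of a VAE population
    extract_factors: Callable,  # hypothetical: factors the model reliably disentangles
    remove_and_label: Callable, # hypothetical: labels samples with learned factors, removes them
    train_supervised: Callable, # hypothetical: final supervised training on accumulated labels
):
    """Sketch of the recursive rPU-VAE procedure: train until convergence,
    remove the learned factors, reiterate, then train one supervised model
    on the accumulated labels."""
    labels: Dict = {}
    remaining = dataset
    while True:
        model = train_pbt_vae(remaining)             # train until convergence
        factors = extract_factors(model, remaining)  # subset of generative factors learned
        if not factors:                              # nothing new disentangled: stop recursing
            break
        remaining, new_labels = remove_and_label(remaining, factors)
        labels.update(new_labels)                    # accumulate labels for the final stage
    return train_supervised(dataset, labels)         # one model disentangling the whole dataset

The dependency-injected helpers keep the sketch self-contained while making explicit that each step (PBT training, UDR scoring, factor removal) is a component of the method rather than a fixed API.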