Title
Differentiable Adaptive Computation Time for Visual Reasoning
Authors
Abstract
This paper presents DACT, a novel attention-based algorithm for adaptive computation which, unlike existing approaches, is end-to-end differentiable. Our method can be used in conjunction with many networks; in particular, we study its application to the widely known MAC architecture, obtaining a significant reduction in the number of recurrent steps needed to achieve similar accuracies and therefore improving its performance-to-computation ratio. Furthermore, we show that by increasing the maximum number of steps used, we surpass the accuracy of even our best non-adaptive MAC on the CLEVR dataset, demonstrating that our approach is able to control the number of steps without significant loss of performance. Additional advantages of our approach include considerably improved interpretability, obtained by discarding useless steps, and more insight into the underlying reasoning process. Finally, we present adaptive computation as equivalent to an ensemble of models, similar to a mixture-of-experts formulation. Both the code and the configuration files for our experiments are made available to support further research in this area.
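The ensemble view mentioned in the abstract can be illustrated with a small sketch. The abstract does not state the update rule, so everything below is an assumption made for illustration: the function name `soft_halting_mixture`, the per-step gate values, and the convex update itself are hypothetical and should not be read as the authors' exact formulation. The sketch only shows how a smooth, gate-driven accumulation of per-step outputs amounts to a weighted ensemble whose weights sum to one, with steps after an effective halt receiving zero weight.

```python
from typing import List, Tuple


def soft_halting_mixture(
    step_outputs: List[List[float]],
    halt_values: List[float],
) -> Tuple[List[float], List[float]]:
    """Combine per-step outputs into one answer using smooth weights.

    step_outputs[n] is the (hypothetical) output distribution of step n.
    halt_values[n] in [0, 1] is a (hypothetical) gate; a value of 0 means
    that later steps can no longer change the answer.
    Returns the combined answer and each step's effective ensemble weight.
    """
    if not step_outputs or len(step_outputs) != len(halt_values):
        raise ValueError("need at least one step and one gate value per step")

    remaining = 1.0  # influence still available to the current step
    answer = [0.0] * len(step_outputs[0])
    weights: List[float] = []

    for y, h in zip(step_outputs, halt_values):
        # Assumed convex update: new answer = remaining * y + (1 - remaining) * old answer
        answer = [remaining * yi + (1.0 - remaining) * ai for yi, ai in zip(y, answer)]
        # Earlier steps are scaled by (1 - remaining); the new step gets `remaining`.
        weights = [w * (1.0 - remaining) for w in weights] + [remaining]
        # Every operation is a product or sum, so the rule stays differentiable.
        remaining *= h

    return answer, weights


# Three steps; the second gate is 0, so the third step is effectively discarded.
outputs = [[0.6, 0.4], [0.9, 0.1], [0.2, 0.8]]
gates = [0.5, 0.0, 1.0]
answer, weights = soft_halting_mixture(outputs, gates)
```

In this toy run the first two steps share the weight equally and the third contributes nothing, which mirrors the idea that steps after the model has settled on an answer can be skipped at inference time.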