Paper Title

A Universal Representation Transformer Layer for Few-Shot Image Classification

Paper Authors

Lu Liu, William Hamilton, Guodong Long, Jing Jiang, Hugo Larochelle

Paper Abstract

Few-shot classification aims to recognize unseen classes when presented with only a small number of samples. We consider the problem of multi-domain few-shot image classification, where unseen classes and examples come from diverse data sources. This problem has seen growing interest and has inspired the development of benchmarks such as Meta-Dataset. A key challenge in this multi-domain setting is to effectively integrate the feature representations from the diverse set of training domains. Here, we propose a Universal Representation Transformer (URT) layer that meta-learns to leverage universal features for few-shot classification by dynamically re-weighting and composing the most appropriate domain-specific representations. In experiments, we show that URT sets a new state-of-the-art result on Meta-Dataset. Specifically, it achieves top performance on the highest number of data sources compared to competing methods. We analyze variants of URT and present a visualization of the attention score heatmaps that sheds light on how the model performs cross-domain generalization. Our code is available at https://github.com/liulu112601/URT.
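The abstract's central mechanism, dynamically re-weighting domain-specific backbone features into a single task-adapted representation via attention, can be illustrated with a short sketch. Below is a minimal, single-head PyTorch approximation written against the abstract alone, not the authors' implementation (see the GitHub repository above): the class name URTLayerSketch, the mean-pooled task summary used as the attention query, and all shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class URTLayerSketch(nn.Module):
    """Minimal single-head sketch of attention-based domain re-weighting.

    Given features of the same images extracted by N frozen domain-specific
    backbones, compute softmax attention scores over the N domains and
    return a score-weighted combination as a task-adapted representation.
    """

    def __init__(self, feat_dim: int, key_dim: int = 128):
        super().__init__()
        self.query_proj = nn.Linear(feat_dim, key_dim)  # task summary -> query
        self.key_proj = nn.Linear(feat_dim, key_dim)    # domain summaries -> keys
        self.scale = key_dim ** 0.5

    def forward(self, domain_feats: torch.Tensor) -> torch.Tensor:
        # domain_feats: (n_images, n_domains, feat_dim) support-set features.
        task_summary = domain_feats.mean(dim=(0, 1))   # (feat_dim,)
        domain_summary = domain_feats.mean(dim=0)      # (n_domains, feat_dim)

        q = self.query_proj(task_summary)              # (key_dim,)
        k = self.key_proj(domain_summary)              # (n_domains, key_dim)
        scores = F.softmax(k @ q / self.scale, dim=0)  # attention over domains

        # Re-weight and compose: one universal feature vector per image.
        return torch.einsum("d,ndf->nf", scores, domain_feats)

# Usage with made-up shapes: 25 support images, 8 domain backbones, 512-d features.
feats = torch.randn(25, 8, 512)
layer = URTLayerSketch(feat_dim=512)
universal = layer(feats)  # (25, 512)
```

The full URT layer reportedly uses multiple attention heads and builds queries from support-set class prototypes; this sketch collapses everything to a single task-level domain weighting for brevity.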
