flexgrid2vec: Learning Efficient Visual Representations Vectors

Paper Title

flexgrid2vec: Learning Efficient Visual Representations Vectors

Authors

Ali Hamdi, Du Yong Kim, Flora D. Salim

Abstract

We propose flexgrid2vec, a novel approach for image representation learning. Existing visual representation methods suffer from several issues, including the need for highly intensive computation, the risk of losing in-depth structural information and the specificity of the method to certain shapes or objects. flexgrid2vec converts an image to a low-dimensional feature vector. We represent each image with a graph of flexible, unique node locations and edge distances. flexgrid2vec is a multi-channel GCN that learns features of the most representative image patches. We have investigated both spectral and non-spectral implementations of the GCN node-embedding. Specifically, we have implemented flexgrid2vec based on different node-aggregation methods, such as vector summation, concatenation and normalisation with eigenvector centrality. We compare the performance of flexgrid2vec with a set of state-of-the-art visual representation learning models on binary and multi-class image classification tasks. Although we utilise imbalanced, low-size and low-resolution datasets, flexgrid2vec shows stable and outstanding results against well-known base classifiers. flexgrid2vec achieves 96.23% on CIFAR-10, 83.05% on CIFAR-100, 94.50% on STL-10, 98.8% on ASIRRA and 89.69% on the COCO dataset.
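The three node-aggregation options named in the abstract (vector summation, concatenation, and normalisation with eigenvector centrality) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the `1 / (1 + distance)` edge-weighting, and the use of a fully connected graph are all assumptions made for illustration. In the paper the node features would come from learned patch embeddings, whereas here they are plain arrays.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance between every pair of node locations (n x n)."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def eigenvector_centrality(adj, iters=200, tol=1e-10):
    """Power iteration on a non-negative adjacency matrix.

    Returns centrality scores normalised to sum to 1.
    """
    n = adj.shape[0]
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):
        nxt = adj @ scores
        total = nxt.sum()
        if total == 0:  # graph without edges: keep the uniform scores
            return scores
        nxt = nxt / total
        if np.abs(nxt - scores).sum() < tol:
            return nxt
        scores = nxt
    return scores


def aggregate(features, points, method="sum"):
    """Collapse per-node features (n x d) into one image-level vector.

    method="sum":        element-wise sum over nodes, length d
    method="concat":     node vectors laid end to end, length n * d
    method="centrality": sum weighted by eigenvector centrality, length d
    """
    if method == "sum":
        return features.sum(axis=0)
    if method == "concat":
        return features.reshape(-1)
    if method == "centrality":
        # Assumption: closer nodes get stronger edges via 1 / (1 + distance).
        adj = 1.0 / (1.0 + pairwise_distances(points))
        np.fill_diagonal(adj, 0.0)  # no self-loops
        weights = eigenvector_centrality(adj)
        return weights @ features
    raise ValueError(f"unknown aggregation method: {method}")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    node_points = rng.uniform(0, 32, size=(5, 2))   # hypothetical key-point locations
    node_features = rng.normal(size=(5, 8))         # hypothetical patch embeddings
    for name in ("sum", "concat", "centrality"):
        print(name, aggregate(node_features, node_points, name).shape)
```

Summation and centrality weighting both give a vector whose length is independent of the number of nodes, while concatenation grows with the node count and so requires a fixed number of nodes per image.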
