A Multi-class Approach -- Building a Visual Classifier based on Textual Descriptions using Zero-Shot Learning

Paper title

A Multi-class Approach -- Building a Visual Classifier based on Textual Descriptions using Zero-Shot Learning

Authors

Sajjan, Preeti Jagdish; Glavin, Frank G.

Abstract

Machine Learning (ML) techniques for image classification routinely require many labelled images for training the model and while testing, we ought to use images belonging to the same domain as those used for training. In this paper, we overcome the two main hurdles of ML, i.e. scarcity of data and constrained prediction of the classification model. We do this by introducing a visual classifier which uses a concept of transfer learning, namely Zero-Shot Learning (ZSL), and standard Natural Language Processing techniques. We train a classifier by mapping labelled images to their textual description instead of training it for specific classes. Transfer learning involves transferring knowledge across domains that are similar. ZSL intelligently applies the knowledge learned while training for future recognition tasks. ZSL differentiates classes as two types: seen and unseen classes. Seen classes are the classes upon which we have trained our model and unseen classes are the classes upon which we test our model. The examples from unseen classes have not been encountered in the training phase. Earlier research in this domain focused on developing a binary classifier but, in this paper, we present a multi-class classifier with a Zero-Shot Learning approach.
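The seen/unseen split described in the abstract can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the authors' implementation: class descriptions are embedded as bag-of-words vectors over a tiny hypothetical vocabulary, image features are simulated by a hypothetical `fake_image_features` function (a stand-in for a real feature extractor), and a linear map from image features to the description space is fitted by plain gradient descent on the seen classes only. An image from an unseen class is then assigned to whichever unseen description is closest in cosine similarity, which makes the prediction multi-class rather than binary.

```python
import random

# Hypothetical vocabulary and class descriptions, chosen only for illustration.
VOCAB = ["striped", "hoofed", "winged", "swimming"]
SEEN = {"tiger": "striped", "horse": "hoofed",
        "duck": "winged swimming", "eagle": "winged"}
UNSEEN = {"zebra": "striped hoofed", "pegasus": "hoofed winged",
          "fish": "swimming"}


def embed(description):
    """Bag-of-words vector of a class description over VOCAB."""
    words = description.split()
    return [1.0 if w in words else 0.0 for w in VOCAB]


def fake_image_features(description, rng, noise=0.05):
    """Assumed stand-in for an image feature extractor:
    the reversed description vector plus small uniform noise."""
    return [v + rng.uniform(-noise, noise) for v in reversed(embed(description))]


def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def train(samples, dim, lr=0.5, epochs=500):
    """Fit W minimising the mean squared error ||W x - t||^2
    with full-batch gradient descent."""
    W = [[0.0] * dim for _ in range(dim)]
    n = len(samples)
    for _ in range(epochs):
        grad = [[0.0] * dim for _ in range(dim)]
        for x, t in samples:
            err = [p - q for p, q in zip(matvec(W, x), t)]
            for i in range(dim):
                for j in range(dim):
                    grad[i][j] += 2.0 * err[i] * x[j] / n
        for i in range(dim):
            for j in range(dim):
                W[i][j] -= lr * grad[i][j]
    return W


def cosine(a, b):
    dot = sum(p * q for p, q in zip(a, b))
    na = sum(p * p for p in a) ** 0.5
    nb = sum(q * q for q in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0


def predict(W, x, candidates):
    """Return the candidate class whose description is closest to W x."""
    projected = matvec(W, x)
    return max(candidates,
               key=lambda name: cosine(projected, embed(candidates[name])))


rng = random.Random(0)
# Training uses images from seen classes only.
train_set = [(fake_image_features(desc, rng), embed(desc))
             for desc in SEEN.values() for _ in range(10)]
W = train(train_set, len(VOCAB))

# Testing uses one simulated image per unseen class.
predictions = {name: predict(W, fake_image_features(desc, rng), UNSEEN)
               for name, desc in UNSEEN.items()}
print(predictions)
```

The sketch works only because the seen descriptions together cover every word in the vocabulary, so the learned map can place a new combination of words (such as "striped hoofed") without ever having seen an example of that class. A real system would replace the simulated features and the bag-of-words embedding with learned representations.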
