Paper Title

Discriminative Dictionary Design for Action Classification in Still Images and Videos

Authors

Abhinaba Roy, Biplab Banerjee, Amir Hussain, Soujanya Poria

Abstract

In this paper, we address the problem of action recognition from still images and videos. Traditional local features such as SIFT and STIP invariably pose two potential problems: 1) they are not evenly distributed across the different entities of a given category, and 2) many such features are not exclusive to the visual concept the entities represent. In order to generate a dictionary that takes the aforementioned issues into account, we propose a novel discriminative method for identifying robust, category-specific local features that maximize class separability. Specifically, we pose the selection of potent local descriptors as a filter-based feature selection problem that ranks the local features per category according to a novel measure of distinctiveness. The underlying visual entities are subsequently represented with the learned dictionary, and this stage is followed by action classification using a random forest model and label-propagation refinement. The framework is validated on action recognition datasets based on still images (Stanford-40) as well as videos (UCF-50), and it outperforms representative methods from the literature.
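To make the filter-based ranking step concrete, the sketch below scores dictionary atoms by how much more often they fire within one class than across all classes. The toy data, the hard bag-of-words assignment, and the frequency-ratio score are all illustrative assumptions; the paper's actual distinctiveness measure, dictionary learning, and random forest / label-propagation stages are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy local descriptors for two action classes, drawn around known cluster
# centers. Class 0 fires around [0,0] and [1,1]; class 1 around [1,1] and
# [2,0], so [1,1] is a shared (non-distinctive) visual word.
def make_descriptors(center, n=50):
    return np.asarray(center, dtype=float) + 0.1 * rng.standard_normal((n, 2))

class_descs = {
    0: np.vstack([make_descriptors([0, 0]), make_descriptors([1, 1])]),
    1: np.vstack([make_descriptors([1, 1]), make_descriptors([2, 0])]),
}

# Candidate dictionary atoms (in practice these would come from clustering
# the pooled local descriptors; fixed here for clarity).
atoms = np.array([[0, 0], [1, 1], [2, 0]], dtype=float)

def assign(descs, atoms):
    """Hard-assign each descriptor to its nearest atom (bag-of-words)."""
    dists = np.linalg.norm(descs[:, None, :] - atoms[None, :, :], axis=2)
    return dists.argmin(axis=1)

def distinctiveness(class_descs, atoms):
    """Filter-style score: per-class firing frequency of each atom,
    normalized by its total frequency over all classes. An atom that
    fires only inside one class scores near 1 for that class."""
    counts = np.array(
        [np.bincount(assign(d, atoms), minlength=len(atoms))
         for d in class_descs.values()],
        dtype=float,
    )
    freq = counts / counts.sum(axis=1, keepdims=True)
    total = freq.sum(axis=0, keepdims=True)
    return freq / (total + 1e-9)

scores = distinctiveness(class_descs, atoms)
# Atom [0,0] ranks highest for class 0, atom [2,0] for class 1, and the
# shared atom [1,1] scores ~0.5 for both, so it would be filtered out
# when selecting the top-ranked, category-specific atoms.
print(np.round(scores, 2))
```

The per-category ranking then keeps only the top-scoring atoms for each class, which is the sense in which the dictionary is pruned toward category-specific, discriminative local features before the encoding and classification stages.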
