A Transformer-Based Contrastive Learning Approach for Few-Shot Sign Language Recognition

Paper Title

A Transformer-Based Contrastive Learning Approach for Few-Shot Sign Language Recognition

Authors

Silvan Ferreira, Esdras Costa, Márcio Dahia, Jampierre Rocha

Abstract

Sign language recognition from sequences of monocular images or 2D poses is a challenging field, not only because of the difficulty of inferring 3D information from 2D data, but also because of the temporal relationships within the sequences. Additionally, the wide variety of signs and the constant need to add new ones in production environments make traditional classification techniques infeasible. We propose a novel contrastive Transformer-based model that learns rich representations from sequences of body key points, allowing better comparison between vector embeddings. This allows us to apply these techniques to one-shot or few-shot tasks, such as classification and translation. The experiments showed that the model generalizes well and achieves competitive results for sign classes never seen during training.
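
To illustrate how comparing vector embeddings can support few-shot classification of unseen sign classes, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify the comparison rule, so the class-prototype averaging and cosine similarity used here are assumptions, and the function names (`build_prototypes`, `classify`) and the toy 3-dimensional embeddings are hypothetical stand-ins for the outputs of a trained encoder.

```python
import math
from collections import defaultdict


def l2_normalize(vec):
    """Scale a vector to unit length so that dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two embeddings of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def build_prototypes(support):
    """Average the normalized support embeddings of each class.

    `support` is a list of (label, embedding) pairs, i.e. the few labeled
    examples available for each new sign (assumed prototype-style rule).
    """
    grouped = defaultdict(list)
    for label, emb in support:
        grouped[label].append(l2_normalize(emb))
    prototypes = {}
    for label, embs in grouped.items():
        dim = len(embs[0])
        prototypes[label] = [sum(e[i] for e in embs) / len(embs) for i in range(dim)]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is most similar to the query embedding."""
    return max(prototypes, key=lambda label: cosine_similarity(query, prototypes[label]))


# Toy embeddings standing in for encoder outputs (hypothetical values).
support = [
    ("hello", [0.9, 0.1, 0.0]),
    ("hello", [0.8, 0.2, 0.1]),
    ("thanks", [0.1, 0.9, 0.2]),
]
prototypes = build_prototypes(support)
prediction = classify([0.85, 0.15, 0.05], prototypes)
```

Under this scheme, adding a new sign only requires embedding a few examples and calling `build_prototypes` again; no classifier layer has to be retrained, which is the property the abstract highlights for production settings.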
