A Comprehensive Study on Deep Learning-based Methods for Sign Language Recognition

Paper Title

A Comprehensive Study on Deep Learning-based Methods for Sign Language Recognition

Authors

Nikolas Adaloglou, Theocharis Chatzis, Ilias Papastratis, Andreas Stergioulas, Georgios Th. Papadopoulos, Vassia Zacharopoulou, George J. Xydopoulos, Klimnis Atzakas, Dimitris Papazachariou, Petros Daras

Abstract

In this paper, a comparative experimental assessment of computer vision-based methods for sign language recognition is conducted. By implementing the most recent deep neural network methods in this field, a thorough evaluation on multiple publicly available datasets is performed. The aim of the present study is to provide insights on sign language recognition, focusing on mapping non-segmented video streams to glosses. For this task, two new sequence training criteria, known from the fields of speech and scene text recognition, are introduced. Furthermore, a plethora of pretraining schemes is thoroughly discussed. Finally, a new RGB+D dataset for the Greek sign language is created. To the best of our knowledge, this is the first sign language dataset where sentence and gloss level annotations are provided for a video capture.
