Paper title
A Comprehensive Study on Deep Learning-based Methods for Sign Language Recognition
Paper authors
Paper abstract
In this paper, a comparative experimental assessment of computer vision-based methods for sign language recognition is conducted. By implementing the most recent deep neural network methods in this field, a thorough evaluation on multiple publicly available datasets is performed. The aim of the present study is to provide insights on sign language recognition, focusing on mapping non-segmented video streams to glosses. For this task, two new sequence training criteria, known from the fields of speech and scene text recognition, are introduced. Furthermore, a plethora of pretraining schemes is thoroughly discussed. Finally, a new RGB+D dataset for the Greek sign language is created. To the best of our knowledge, this is the first sign language dataset where sentence and gloss level annotations are provided for a video capture.