Statistical and Spatio-temporal Hand Gesture Features for Sign Language Recognition using the Leap Motion Sensor

Paper title

Statistical and Spatio-temporal Hand Gesture Features for Sign Language Recognition using the Leap Motion Sensor

Author

Bird, Jordan J.

Abstract

In modern society, people should not be identified by their disability; rather, it is environments that can disable people with impairments. Improvements to automatic Sign Language Recognition (SLR) will lead to more enabling environments via digital technology. Many state-of-the-art approaches to SLR focus on the classification of static hand gestures, but communication is a temporal activity, which is reflected by many of the dynamic gestures present. Despite this, temporal information during the delivery of a gesture is not often considered within SLR. The experiments in this work consider the problem of SL gesture recognition with regard to how dynamic gestures change during their delivery, and the study aims to explore how single types of features as well as mixed features affect the classification ability of a machine learning model. Eighteen common gestures recorded via a Leap Motion Controller sensor provide a complex classification problem. Two sets of features are extracted from a 0.6-second time window: statistical descriptors and spatio-temporal attributes. Features from each set are ranked by their ANOVA F-scores and p-values and arranged into bins that grow by 10 features per step, up to a limit of the 250 highest-ranked features. Results show that the best statistical model selected 240 features and scored 85.96% accuracy, the best spatio-temporal model selected 230 features and scored 80.98%, and the best mixed-feature model selected 240 features from each set, leading to a classification accuracy of 86.75%. When all three sets of results are compared (146 individual machine learning models), the overall distribution shows that the minimum results are higher when the inputs are any number of mixed features than when they are any number of features from either of the two single sets.
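
The feature-ranking step described in the abstract can be illustrated with a minimal sketch. It assumes a plain one-way ANOVA F statistic computed per feature and nested top-k subsets grown in fixed steps. The function names `anova_f_score` and `ranked_feature_bins` are hypothetical, the sketch does not compute p-values, and it is not the author's implementation.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic for one feature, grouped by class label.

    Assumes at least two classes and more samples than classes.
    """
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Between-class and within-class sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        # Degenerate case: no variation inside any class.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def ranked_feature_bins(X, y, step=10, limit=250):
    """Rank feature columns by F-score and return nested top-k index lists.

    X is a list of rows (one row per gesture window), y the class labels.
    With the defaults this yields the top 10, top 20, ... up to top 250
    features, or fewer bins if X has fewer than 250 columns.
    """
    n_features = len(X[0])
    scores = [anova_f_score([row[j] for row in X], y) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    top = min(limit, n_features)
    return [order[:k] for k in range(step, top + 1, step)]
```

Each returned index list could then be used to train and score one model, which gives one accuracy value per bin. A mixed-feature run under the same assumptions would take the top-k indices from the statistical set and the top-k from the spatio-temporal set and concatenate the selected columns.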
