Paper Title

Symmetric Dilated Convolution for Surgical Gesture Recognition

Authors

Zhang, Jinglu; Nie, Yinyu; Lyu, Yao; Li, Hailin; Chang, Jian; Yang, Xiaosong; Zhang, Jian Jun

Abstract

Automatic surgical gesture recognition is a prerequisite for intra-operative computer assistance and objective surgical skill assessment. Prior works either require additional sensors to collect kinematics data or are limited in capturing temporal information from long, untrimmed surgical videos. To tackle these challenges, we propose a novel temporal convolutional architecture that automatically detects and segments surgical gestures with their corresponding boundaries using only RGB videos. We devise our method with a symmetric dilation structure bridged by a self-attention module to encode and decode long-term temporal patterns and establish frame-to-frame relationships accordingly. We validate the effectiveness of our approach on a fundamental robotic suturing task from the JIGSAWS dataset. The experimental results demonstrate the ability of our method to capture long-term frame dependencies: it substantially outperforms state-of-the-art methods, by up to ~6 points in frame-wise accuracy and by ~6 points in F1@50 score.
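
To give a rough sense of why a symmetric dilation structure can cover long videos, the sketch below computes how many frames a single output frame can see after a mirrored stack of dilated 1-D convolutions. The abstract does not state the dilation rates, kernel size, or layer count, so the doubling schedule, `kernel_size=3`, `num_layers=5`, and the function names `symmetric_dilations` and `receptive_field` are all assumptions made for illustration, not the authors' configuration. The self-attention bridge is not modeled here.

```python
def symmetric_dilations(num_layers, base=2):
    """Return (encoder, decoder) dilation rates for a mirrored stack.

    Assumption: rates grow geometrically in the encoder and shrink in the
    decoder. The abstract does not specify the actual schedule.
    """
    encoder = [base ** i for i in range(num_layers)]  # 1, 2, 4, ...
    decoder = list(reversed(encoder))                 # ..., 4, 2, 1
    return encoder, decoder


def receptive_field(dilations, kernel_size=3):
    """Number of input frames that influence one output frame."""
    # A 1-D convolution with kernel size k and dilation d widens the
    # receptive field by (k - 1) * d frames.
    return 1 + sum((kernel_size - 1) * d for d in dilations)


encoder, decoder = symmetric_dilations(num_layers=5)
total = receptive_field(encoder + decoder)
print("encoder dilations:", encoder)
print("decoder dilations:", decoder)
print("receptive field (frames):", total)
```

Under these assumed settings the receptive field grows exponentially with the number of layers while the parameter count grows only linearly, which is the usual motivation for dilated temporal convolutions on long, untrimmed sequences.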
