Paper Title

Less is More: Surgical Phase Recognition from Timestamp Supervision

Authors

Xinpeng Ding, Xinjian Yan, Zixun Wang, Wei Zhao, Jian Zhuang, Xiaowei Xu, Xiaomeng Li

Abstract

Surgical phase recognition is a fundamental task in computer-assisted surgery systems. Most existing methods are trained under full annotations, which are expensive and time-consuming because surgeons must watch videos repeatedly to find the precise start and end time of each surgical phase. In this paper, we introduce timestamp supervision for surgical phase recognition, in which models are trained with timestamp annotations: surgeons are asked to identify only a single timestamp within the temporal boundary of each phase. This annotation significantly reduces the manual annotation cost compared with full annotation. To make full use of such timestamp supervision, we propose a novel method called uncertainty-aware temporal diffusion (UATD) to generate trustworthy pseudo labels for training. UATD is motivated by a property of surgical videos, namely that phases are long events consisting of consecutive frames. Specifically, UATD iteratively diffuses each single labelled timestamp to its neighbouring frames that have high confidence (i.e., low uncertainty). Our study uncovers several insights into surgical phase recognition with timestamp supervision: 1) timestamp annotation reduces annotation time by 74% compared with full annotation, and surgeons tend to place timestamps near the middle of phases; 2) extensive experiments demonstrate that our method achieves competitive results compared with fully supervised methods while reducing manual annotation cost; 3) less is more in surgical phase recognition, i.e., fewer but discriminative pseudo labels outperform full labels that contain ambiguous frames; 4) the proposed UATD can be used as a plug-and-play method to clean ambiguous labels near the boundaries between phases and improve the performance of current surgical phase recognition methods.
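The diffusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `diffuse_timestamps`, the fixed `threshold`, and the assumption that per-frame uncertainty is already available as a list of floats are all hypothetical choices made for illustration. The abstract does not say how uncertainty is estimated or whether it is recomputed between iterations, so the sketch keeps the uncertainty values fixed.

```python
from typing import Dict, List, Optional


def diffuse_timestamps(
    uncertainty: List[float],
    timestamps: Dict[int, int],
    threshold: float,
    max_iters: int = 100,
) -> List[Optional[int]]:
    """Spread each timestamp label to adjacent low-uncertainty frames.

    uncertainty: hypothetical per-frame uncertainty scores (lower = more confident).
    timestamps:  frame index -> phase label, one annotated frame per phase.
    threshold:   assumed cut-off; frames at or above it stay unlabelled.
    """
    n = len(uncertainty)
    labels: List[Optional[int]] = [None] * n
    for idx, phase in timestamps.items():
        labels[idx] = phase

    for _ in range(max_iters):
        # Collect this iteration's new labels first, so each iteration
        # grows every labelled segment by at most one frame per side.
        updates: Dict[int, int] = {}
        for i, phase in enumerate(labels):
            if phase is None:
                continue
            for j in (i - 1, i + 1):
                if 0 <= j < n and labels[j] is None and uncertainty[j] < threshold:
                    # If two phases reach the same frame, the earlier one wins.
                    updates.setdefault(j, phase)
        if not updates:
            break  # no confident neighbour left to label
        for j, phase in updates.items():
            labels[j] = phase
    return labels


# Two phases with an ambiguous (high-uncertainty) boundary at frames 3 and 4.
scores = [0.1, 0.1, 0.1, 0.9, 0.9, 0.1, 0.1, 0.1]
pseudo = diffuse_timestamps(scores, {1: 0, 6: 1}, threshold=0.5)
print(pseudo)  # [0, 0, 0, None, None, 1, 1, 1]
```

In this toy example the ambiguous boundary frames remain unlabelled, which mirrors the "less is more" observation: fewer but confident pseudo labels are kept, and uncertain frames near phase boundaries are left out of training.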
