CholecTriplet2021: A benchmark challenge for surgical action triplet recognition

Paper title

CholecTriplet2021: A benchmark challenge for surgical action triplet recognition

Authors

Nwoye, Chinedu Innocent, Alapatt, Deepak, Yu, Tong, Vardazaryan, Armine, Xia, Fangfang, Zhao, Zixuan, Xia, Tong, Jia, Fucang, Yang, Yuxuan, Wang, Hao, Yu, Derong, Zheng, Guoyan, Duan, Xiaotian, Getty, Neil, Sanchez-Matilla, Ricardo, Robu, Maria, Zhang, Li, Chen, Huabin, Wang, Jiacheng, Wang, Liansheng, Zhang, Bokai, Gerats, Beerend, Raviteja, Sista, Sathish, Rachana, Tao, Rong, Kondo, Satoshi, Pang, Winnie, Ren, Hongliang, Abbing, Julian Ronald, Sarhan, Mohammad Hasan, Bodenstedt, Sebastian, Bhasker, Nithya, Oliveira, Bruno, Torres, Helena R., Ling, Li, Gaida, Finn, Czempiel, Tobias, Vilaça, João L., Morais, Pedro, Fonseca, Jaime, Egging, Ruby Mae, Wijma, Inge Nicole, Qian, Chen, Bian, Guibin, Li, Zhen, Balasubramanian, Velmurugan, Sheet, Debdoot, Luengo, Imanol, Zhu, Yuanbo, Ding, Shuai, Aschenbrenner, Jakob-Anton, van der Kar, Nicolas Elini, Xu, Mengya, Islam, Mobarakol, Seenivasan, Lalithkumar, Jenke, Alexander, Stoyanov, Danail, Mutter, Didier, Mascagni, Pietro, Seeliger, Barbara, Gonzalez, Cristians, Padoy, Nicolas

Abstract

Context-aware decision support in the operating room can foster surgical safety and efficiency by leveraging real-time feedback from surgical workflow analysis. Most existing works recognize surgical activities at a coarse-grained level, such as phases, steps or events, leaving out fine-grained interaction details about the surgical activity; yet those details are needed for more helpful AI assistance in the operating room. Recognizing surgical actions as triplets of the form <instrument, verb, target> delivers comprehensive details about the activities taking place in surgical videos. This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos. The challenge granted private access to the large-scale CholecT50 dataset, which is annotated with action triplet information. In this paper, we present the challenge setup and an assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge. A total of 4 baseline methods from the challenge organizers and 19 new deep learning algorithms by competing teams are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%. This study also analyzes the significance of the results obtained by the presented approaches, performs a thorough methodological comparison and an in-depth result analysis, and proposes a novel ensemble method for enhanced recognition. Our analysis shows that surgical workflow analysis is not yet solved, and also highlights interesting directions for future research on fine-grained surgical activity recognition, which is of utmost importance for the development of AI in surgery.
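
To make the reported metric concrete, the following is a minimal sketch of how mean average precision (mAP) could be computed for multi-label triplet recognition. It assumes one common definition of average precision (mean of the precision values at each correctly ranked positive frame) and averages over triplet classes that have at least one positive frame. The abstract does not spell out the challenge's exact evaluation protocol, so the function names, the data layout, and the toy values below are illustrative assumptions, not the organizers' implementation.

```python
import math


def average_precision(scores, labels):
    """AP for one triplet class: mean precision at each positive, ranked by score.

    Returns None when the class has no positive frame (AP is undefined there).
    """
    # Rank frame indices from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else None


def mean_average_precision(score_rows, label_rows):
    """mAP over classes; rows are frames, columns are triplet classes.

    Assumption: classes without any positive frame are skipped.
    """
    n_classes = len(score_rows[0])
    aps = []
    for c in range(n_classes):
        ap = average_precision(
            [row[c] for row in score_rows],
            [row[c] for row in label_rows],
        )
        if ap is not None:
            aps.append(ap)
    return sum(aps) / len(aps) if aps else math.nan


# Toy data (made up): 4 frames, 3 hypothetical <instrument, verb, target> classes.
toy_scores = [
    [0.9, 0.1, 0.5],
    [0.8, 0.9, 0.4],
    [0.7, 0.2, 0.3],
    [0.6, 0.3, 0.2],
]
toy_labels = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
toy_map = mean_average_precision(toy_scores, toy_labels)
print(f"mAP = {toy_map:.4f}")
```

In this toy setup the third class never occurs, so it is excluded from the mean. Whether absent classes are skipped or counted differently is a design choice that can shift the final number noticeably on datasets with many rare triplets.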
