m2caiSeg: Semantic Segmentation of Laparoscopic Images using Convolutional Neural Networks

Paper Title

m2caiSeg: Semantic Segmentation of Laparoscopic Images using Convolutional Neural Networks

Authors

Salman Maqbool, Aqsa Riaz, Hasan Sajid, Osman Hasan

Abstract

Autonomous surgical procedures, in particular minimally invasive surgeries, are the next frontier for Artificial Intelligence research. However, existing challenges include precise identification of the human anatomy and the surgical setting, and modeling the environment for training an autonomous agent. To address the identification of human anatomy and the surgical setting, we propose a deep learning based semantic segmentation algorithm to identify and label the tissues and organs in the endoscopic video feed of the human torso region. We present an annotated dataset, m2caiSeg, created from endoscopic video feeds of real-world surgical procedures. Overall, the data consists of 307 images, each of which is annotated for the organs and different surgical instruments present in the scene. We propose and train a deep convolutional neural network for the semantic segmentation task. To compensate for the small quantity of annotated data, we use unsupervised pre-training and data augmentation. The trained model is evaluated on an independent test set of the proposed dataset. We obtained an F1 score of 0.33 when using all the labeled categories for the semantic segmentation task. We then merged all instruments into an 'Instruments' superclass to evaluate the model's performance on discerning the various organs, and obtained an F1 score of 0.57. We propose a new dataset and a deep learning method for pixel-level identification of various organs and instruments in an endoscopic surgical scene. Surgical scene understanding is one of the first steps towards automating surgical procedures.
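The two evaluation settings in the abstract (all labeled categories versus a merged 'Instruments' superclass) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class ids, the toy masks, the helper names `merge_labels`, `per_class_f1`, and `macro_f1`, and the choice of an unweighted mean over classes. The abstract does not say how the reported F1 scores were averaged.

```python
import numpy as np

def merge_labels(mask, member_ids, super_id):
    """Relabel every pixel whose class id is in member_ids as super_id."""
    merged = mask.copy()
    merged[np.isin(mask, member_ids)] = super_id
    return merged

def per_class_f1(pred, target, class_ids):
    """Pixel-wise F1 per class; NaN when a class is absent from both masks."""
    scores = {}
    for c in class_ids:
        tp = np.sum((pred == c) & (target == c))  # true positives
        fp = np.sum((pred == c) & (target != c))  # false positives
        fn = np.sum((pred != c) & (target == c))  # false negatives
        denom = 2 * tp + fp + fn
        scores[c] = float(2 * tp / denom) if denom > 0 else float("nan")
    return scores

def macro_f1(pred, target, class_ids):
    """Unweighted mean of per-class F1 (an assumed averaging scheme)."""
    values = list(per_class_f1(pred, target, class_ids).values())
    return float(np.nanmean(values))

# Hypothetical ids: 0 and 1 are organs, 2 and 3 are two instrument types.
target = np.array([[0, 0, 1],
                   [2, 3, 3]])
pred = np.array([[0, 1, 1],
                 [3, 3, 2]])

f1_all = macro_f1(pred, target, [0, 1, 2, 3])

# Collapse both instrument ids into a single superclass (id 2) and re-score.
target_m = merge_labels(target, [2, 3], 2)
pred_m = merge_labels(pred, [2, 3], 2)
f1_merged = macro_f1(pred_m, target_m, [0, 1, 2])
```

In this toy case the score rises after merging because confusing one instrument type with another no longer counts as an error, which is the same direction as the change from 0.33 to 0.57 reported in the abstract.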
