Paper Title
Radiologist-level Performance by Using Deep Learning for Segmentation of Breast Cancers on MRI Scans
Paper Authors
Paper Abstract
Purpose: To develop a deep network architecture that would achieve fully automated radiologist-level segmentation of cancers at breast MRI.
Materials and Methods: In this retrospective study, 38229 examinations (composed of 64063 individual breast scans from 14475 patients) were performed in female patients (age range, 12-94 years; mean age, 52 years ± 10 [standard deviation]) who presented between 2002 and 2014 at a single clinical site. A total of 2555 breast cancers were selected that had been segmented on two-dimensional (2D) images by radiologists, as well as 60108 benign breasts that served as examples of noncancerous tissue; all these were used for model training. For testing, an additional 250 breast cancers were segmented independently on 2D images by four radiologists. Authors selected among several three-dimensional (3D) deep convolutional neural network architectures, input modalities, and harmonization methods. The outcome measure was the Dice score for 2D segmentation, which was compared between the network and radiologists by using the Wilcoxon signed rank test and the two one-sided test procedure.
Results: The highest-performing network on the training set was a 3D U-Net with dynamic contrast-enhanced MRI as input and with intensity normalized for each examination. In the test set, the median Dice score of this network was 0.77 (interquartile range, 0.26). The performance of the network was equivalent to that of the radiologists (two one-sided test procedures with radiologist performance of 0.69-0.84 as equivalence bounds, P ≤ .001 for both; n = 250).
Conclusion: When trained on a sufficiently large dataset, the developed 3D U-Net performed as well as fellowship-trained radiologists in detailed 2D segmentation of breast cancers at routine clinical MRI.
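The outcome measure named in the abstract, the Dice score for 2D segmentation, can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `dice_score`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made here for illustration only.

```python
def dice_score(pred_mask, ref_mask):
    """Dice score 2|A ∩ B| / (|A| + |B|) for two binary 2D masks.

    Masks are nested sequences (rows of 0/1 or False/True values).
    Hypothetical helper for illustration; not taken from the paper.
    """
    # Flatten both masks row by row into lists of booleans.
    pred = [bool(v) for row in pred_mask for v in row]
    ref = [bool(v) for row in ref_mask for v in row]
    if len(pred) != len(ref):
        raise ValueError("masks must contain the same number of pixels")

    # Count pixels marked as lesion in both masks, and in each mask.
    overlap = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(pred) + sum(ref)

    # Assumed convention: two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * overlap / total


# Toy 2x3 masks: each marks 3 pixels, and 2 of those pixels coincide.
network_mask = [[0, 1, 1],
                [0, 1, 0]]
radiologist_mask = [[0, 1, 0],
                    [0, 1, 1]]
print(dice_score(network_mask, radiologist_mask))
```

In the toy example each mask marks three pixels and two are shared, so the score is 2 × 2 / (3 + 3) = 2/3. A score of 1 means the two segmentations coincide exactly, and 0 means they do not overlap at all.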