Paper Title

High resolution weakly supervised localization architectures for medical images

Authors

Preechakul, Konpat, Sriswasdi, Sira, Kijsirikul, Boonserm, Chuangsuwanich, Ekapol

Abstract

In medical imaging, Class-Activation Map (CAM) serves as the main explainability tool by pointing to the region of interest. Since the localization accuracy from CAM is constrained by the resolution of the model's feature map, one may expect that segmentation models, which generally have large feature maps, would produce more accurate CAMs. However, we have found that this is not the case due to task mismatch. While segmentation models are developed for datasets with pixel-level annotation, only image-level annotation is available in most medical imaging datasets. Our experiments suggest that Global Average Pooling (GAP) and Group Normalization are the main culprits that worsen the localization accuracy of CAM. To address this issue, we propose Pyramid Localization Network (PYLON), a model for high-accuracy weakly-supervised localization that achieved 0.62 average point localization accuracy on NIH's Chest X-Ray 14 dataset, compared to 0.45 for a traditional CAM model. Source code and extended results are available at https://github.com/cmb-chula/pylon.
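The abstract's core observation is that a CAM is computed from the backbone's final feature maps, so its spatial resolution cannot exceed theirs. Below is a minimal NumPy sketch (not the authors' PYLON implementation) of the standard GAP-plus-linear-classifier pipeline, showing why: the CAM for a class is just the classifier-weighted sum of feature maps, and its spatial mean recovers that class's logit. All shapes and variable names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feature maps from a CNN backbone: C channels at low spatial
# resolution (e.g. 7x7 for a 224x224 input through a ResNet).
C, H, W = 8, 7, 7
features = rng.standard_normal((C, H, W))

# Linear classifier weights applied after Global Average Pooling (GAP).
num_classes = 3
weights = rng.standard_normal((num_classes, C))

# Standard classification path: GAP over space, then a linear layer.
gap = features.mean(axis=(1, 2))   # shape (C,)
logits = weights @ gap             # shape (num_classes,)

# Class Activation Map: per-class weighted sum of the feature maps.
cams = np.einsum("kc,chw->khw", weights, features)  # (num_classes, H, W)

# The spatial mean of each CAM equals that class's logit, so the CAM
# can never be sharper than the (H, W) feature-map grid it came from.
assert np.allclose(cams.mean(axis=(1, 2)), logits)
```

This identity is why segmentation-style backbones with larger feature maps might be expected to yield sharper CAMs, and why the paper instead investigates how GAP and the normalization choice interact with weakly supervised training.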
