Paper title
Deep is a Luxury We Don't Have
Paper authors
Paper abstract
Medical images come in high resolutions. A high resolution is vital for finding malignant tissues at an early stage. Yet, this resolution presents a challenge in terms of modeling long-range dependencies. Shallow transformers eliminate this problem, but they suffer from quadratic complexity. In this paper, we tackle this complexity by leveraging a linear self-attention approximation. Through this approximation, we propose an efficient vision model called HCT, which stands for High resolution Convolutional Transformer. HCT brings transformers' merits to high resolution images at a significantly lower cost. We evaluate HCT using a high resolution mammography dataset. HCT is significantly superior to its CNN counterpart. Furthermore, we demonstrate HCT's fitness for medical images by evaluating its effective receptive field. Code is available at https://bit.ly/3ykBhhf
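The abstract's key idea, replacing quadratic self-attention with a linear approximation, can be illustrated with a generic kernel-feature-map linearization. This is a minimal sketch of that general technique, not the authors' actual HCT implementation; the choice of feature map `phi(x) = elu(x) + 1` is an assumption borrowed from common linear-attention formulations.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard self-attention: materializes an (n, n) score matrix,
    # hence O(n^2) cost in sequence length n.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, eps=1e-6):
    # Linear approximation: replace exp(q.k) with phi(q).phi(k) for a
    # positive feature map phi, here phi(x) = elu(x) + 1 (an assumed,
    # commonly used choice).
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))
    Qp, Kp = phi(Q), phi(K)
    # Reassociate (Qp Kp^T) V as Qp (Kp^T V): cost drops from
    # O(n^2 d) to O(n d^2), linear in n.
    kv = Kp.T @ V                    # (d, d_v) summary of keys/values
    z = Qp @ Kp.sum(axis=0)          # (n,) per-query normalizer
    return (Qp @ kv) / (z[:, None] + eps)
```

For high-resolution images the token count n grows with the number of pixels or patches, so the n-versus-n² gap is exactly what makes such an approximation attractive here.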