Unsupervised segmentation of biomedical hyperspectral image data: tackling high dimensionality with convolutional autoencoders

Paper title

Unsupervised segmentation of biomedical hyperspectral image data: tackling high dimensionality with convolutional autoencoders

Authors

Ciaran Bench, Jayakrupakar Nallala, Chun-Chin Wang, Hannah Sheridan, Nicholas Stone

Abstract

Information about the structure and composition of biopsy specimens can assist in disease monitoring and diagnosis. In principle, this can be acquired from Raman and infrared (IR) hyperspectral images (HSIs) that encode information about how a sample's constituent molecules are arranged in space. Each tissue section/component is defined by a unique combination of spatial and spectral features, but given the high dimensionality of HSI datasets, extracting and utilising these features to segment images is non-trivial. Here, we show how networks based on deep convolutional autoencoders (CAEs) can perform this task in an end-to-end fashion by first detecting and compressing relevant features from patches of the HSI into low-dimensional latent vectors, and then performing a clustering step that groups together patches containing similar spatio-spectral features. Using simulated HSIs of porcine tissue as test examples, we showcase the advantages of this end-to-end spatio-spectral segmentation approach compared to i) the same spatio-spectral technique not trained in an end-to-end manner, and ii) a method that only utilises spectral features (spectral k-means). Secondly, we describe the potential advantages/limitations of using three different CAE architectures: a generic 2D CAE, a generic 3D CAE, and a 2D CNN architecture inspired by the recently proposed UwU-net that is specialised for extracting features from HSI data. We assess their performance on IR HSIs of real colon samples. We find that all architectures are capable of producing segmentations that show good correspondence with H&E-stained adjacent tissue slices used as approximate ground truths, indicating the robustness of the CAE-driven approach for segmenting biomedical HSI data. Additionally, we stress the need for more accurate ground truth information to rigorously compare the advantages offered by each architecture.
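
To make the two clustering strategies in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes an HSI stored as an array of shape (height, width, bands). `spectral_kmeans` illustrates the spectral-only baseline by clustering each pixel's spectrum. `patch_kmeans` illustrates the patch-then-cluster structure of the spatio-spectral approach, but uses raw flattened patches as a stand-in for the latent vectors that a trained CAE would produce. All function names, the patch size, and the plain Lloyd's k-means are assumptions made for illustration.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on the rows of X; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct samples.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = None
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members):  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    return labels, centroids


def spectral_kmeans(cube, k, **kwargs):
    """Spectral-only baseline: cluster each pixel's spectrum independently."""
    h, w, b = cube.shape
    labels, _ = kmeans(cube.reshape(h * w, b), k, **kwargs)
    return labels.reshape(h, w)


def extract_patches(cube, patch):
    """Split the cube into non-overlapping patch x patch x bands blocks."""
    h, w, b = cube.shape
    ph, pw = h // patch, w // patch
    cube = cube[:ph * patch, :pw * patch]  # drop any ragged border
    blocks = cube.reshape(ph, patch, pw, patch, b).transpose(0, 2, 1, 3, 4)
    return blocks.reshape(ph * pw, patch * patch * b), (ph, pw)


def patch_kmeans(cube, patch, k, **kwargs):
    """Cluster whole patches. Raw flattened patches stand in here for the
    low-dimensional latent vectors a trained CAE encoder would provide."""
    vectors, grid = extract_patches(cube, patch)
    labels, _ = kmeans(vectors, k, **kwargs)
    return labels.reshape(grid)
```

In the approach the abstract describes, the flattened patch vectors would be replaced by CAE latent vectors, which is what makes clustering tractable when each patch spans hundreds of spectral bands.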
