Paper title
Improving 3D convolutional neural network comprehensibility via interactive visualization of relevance maps: Evaluation in Alzheimer's disease
Authors
Abstract
Background: Although convolutional neural networks (CNNs) achieve high diagnostic accuracy for detecting Alzheimer's disease (AD) dementia based on magnetic resonance imaging (MRI) scans, they are not yet applied in clinical routine. One important reason for this is a lack of model comprehensibility. Recently developed visualization methods for deriving CNN relevance maps may help to fill this gap. We investigated whether models with higher accuracy also rely more on discriminative brain regions predefined by prior knowledge.
Methods: We trained a CNN for the detection of AD in N=663 T1-weighted MRI scans of patients with dementia and amnestic mild cognitive impairment (MCI) and verified the accuracy of the models via cross-validation and in three independent samples including N=1655 cases. We evaluated the association between relevance scores and hippocampus volume to validate the clinical utility of this approach. To improve model comprehensibility, we implemented an interactive visualization of 3D CNN relevance maps.
Results: Across three independent datasets, group separation showed high accuracy for AD dementia vs. controls (AUC $\geq$ 0.92) and moderate accuracy for MCI vs. controls (AUC $\approx$ 0.75). Relevance maps indicated that hippocampal atrophy was considered the most informative factor for AD detection, with additional contributions from atrophy in other cortical and subcortical regions. Relevance scores within the hippocampus were highly correlated with hippocampal volumes (Pearson's r $\approx$ -0.86, p<0.001).
Conclusion: The relevance maps highlighted atrophy in regions that we had hypothesized a priori. This strengthens the comprehensibility of the CNN models, which were trained in a purely data-driven manner based on the scans and diagnosis labels.
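The two summary statistics reported in the abstract, the area under the ROC curve (AUC) and Pearson's r, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `pearson_r` and `auc` and the synthetic volume and relevance values below are assumptions made purely for illustration, and no real data from the study are used.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def auc(scores, labels):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive case receives the higher score; ties count as 0.5."""
    pos = [s for s, lab in zip(scores, labels) if lab]
    neg = [s for s, lab in zip(scores, labels) if not lab]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Synthetic illustration only (hypothetical values, not study data):
# smaller hippocampal volume is made to coincide with higher relevance.
rng = random.Random(0)
volumes = [rng.gauss(3.5, 0.5) for _ in range(200)]
relevance = [-2.0 * v + rng.gauss(0.0, 0.5) for v in volumes]
r_synthetic = pearson_r(relevance, volumes)  # negative by construction
```

A strongly negative r, as reported in the abstract, means that scans with smaller hippocampal volume tend to receive higher relevance within the hippocampus. The pairwise AUC above is quadratic in the number of cases, which is acceptable for a sketch but slow for large samples.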