Title
Strategising template-guided needle placement for MR-targeted prostate biopsy
Authors
Abstract
Clinically significant prostate cancer stands a better chance of being sampled during ultrasound-guided biopsy procedures if suspected lesions found in pre-operative magnetic resonance (MR) images are used as targets. However, the diagnostic accuracy of the biopsy procedure is limited by the operator-dependent skills and experience in sampling the targets, a sequential decision-making process that involves navigating an ultrasound probe and placing a series of sampling needles for potentially multiple targets. This work aims to learn a reinforcement learning (RL) policy that optimises the actions of continuous positioning of 2D ultrasound views and biopsy needles with respect to a guiding template, such that the MR targets can be sampled efficiently and sufficiently. We first formulate the task as a Markov decision process (MDP) and construct an environment that allows the targeting actions to be performed virtually for individual patients, based on their anatomy and lesions derived from MR images. A patient-specific policy can thus be optimised, before each biopsy procedure, by rewarding positive sampling in the MDP environment. Experimental results from fifty-four prostate cancer patients show that the proposed RL-learned policies obtained a mean hit rate of 93% and an average cancer core length of 11 mm, comparing favourably to two alternative baseline strategies designed by humans, without hand-engineered rewards that directly maximise these clinically relevant metrics. Perhaps more interestingly, it is found that the RL agents learned strategies that were adaptive to the lesion size, where spread of the needles was prioritised for smaller lesions. Such a strategy has not been previously reported or commonly adopted in clinical practice, but led to an overall superior targeting performance when compared with intuitively designed strategies.
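To make the MDP formulation concrete, the following is a minimal toy sketch of such an environment: an agent displaces a virtual needle over a 2D template grid and receives a reward when the needle tip lands inside a lesion. The class name, the grid geometry, the spherical lesion model, and the sparse hit reward are all illustrative assumptions, not the paper's actual environment or reward design.

```python
import numpy as np

class ToyBiopsyEnv:
    """Toy MDP sketch of template-guided needle placement.

    The agent moves a virtual needle on a 2D template grid via
    continuous displacement actions; a positive reward is given
    when the needle tip falls inside a (here, circular) lesion.
    All dimensions and the reward scheme are illustrative only.
    """

    def __init__(self, lesion_centre, lesion_radius, grid_size=10, seed=0):
        self.lesion_centre = np.asarray(lesion_centre, dtype=float)
        self.lesion_radius = float(lesion_radius)
        self.grid_size = grid_size
        self.rng = np.random.default_rng(seed)
        self.pos = None

    def reset(self):
        # Start the virtual needle at a random template position.
        self.pos = self.rng.integers(0, self.grid_size, size=2).astype(float)
        return self.pos.copy()

    def step(self, action):
        # action: continuous 2D displacement of the needle guide,
        # clipped to stay within the template grid.
        self.pos = np.clip(self.pos + np.asarray(action, dtype=float),
                           0.0, self.grid_size - 1)
        # Reward positive sampling: needle tip inside the lesion.
        hit = np.linalg.norm(self.pos - self.lesion_centre) <= self.lesion_radius
        reward = 1.0 if hit else 0.0
        done = bool(hit)
        return self.pos.copy(), reward, done
```

A policy optimised against such an environment before each procedure would be patient-specific, since the lesion geometry is taken from that patient's MR-derived anatomy.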