Paper Title

Controlling Level of Unconsciousness by Titrating Propofol with Deep Reinforcement Learning

Authors

Gabe Schamberg, Marcus Badgeley, Emery N. Brown

Abstract

Reinforcement Learning (RL) can be used to fit a mapping from patient state to a medication regimen. Prior studies have used deterministic and value-based tabular learning to learn a propofol dose from an observed anesthetic state. Deep RL replaces the table with a deep neural network and has been used to learn medication regimens from registry databases. Here we perform the first application of deep RL to closed-loop control of anesthetic dosing in a simulated environment. We use the cross-entropy method to train a deep neural network to map an observed anesthetic state to a probability of infusing a fixed propofol dosage. During testing, we implement a deterministic policy that transforms the probability of infusion to a continuous infusion rate. The model is trained and tested on simulated pharmacokinetic/pharmacodynamic models with randomized parameters to ensure robustness to patient variability. The deep RL agent significantly outperformed a proportional-integral-derivative controller (median absolute performance error 1.7% +/- 0.6 vs. 3.4% +/- 1.2). Modeling continuous input variables instead of a table affords more robust pattern recognition and utilizes our prior domain knowledge. Deep RL learned a smooth policy with a natural interpretation to data scientists and anesthesia care providers alike.
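As a concrete illustration of the training scheme the abstract describes, the following is a minimal cross-entropy-method sketch in Python. It is illustrative only: PolicyNet, simulate_episode, the toy effect-site dynamics, and all hyperparameters (network size, elite percentile, learning rate) are assumptions standing in for the authors' actual PK/PD simulator and model, not their implementation.

```python
# Minimal cross-entropy-method (CEM) sketch: sample episodes with a
# stochastic policy, keep the elite fraction by return, then fit the
# network to the elite state-action pairs with a cross-entropy loss.
import numpy as np
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Maps an observed anesthetic state to a probability of infusing."""
    def __init__(self, state_dim=2, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),  # P(infuse fixed dose)
        )

    def forward(self, state):
        return self.net(state)

def simulate_episode(policy, steps=100):
    """Placeholder rollout with a randomized patient parameter.
    A real environment would integrate a pharmacokinetic/pharmacodynamic
    model; here a crude first-order effect model stands in for it."""
    rng = np.random.default_rng()
    target = 0.5                                  # desired unconsciousness level
    level = 0.0
    gain = rng.uniform(0.05, 0.15)                # randomized patient sensitivity
    states, actions, total_reward = [], [], 0.0
    for _ in range(steps):
        s = torch.tensor([level, target - level], dtype=torch.float32)
        p = policy(s).item()
        a = float(rng.random() < p)               # stochastic action while training
        level += gain * a - 0.02 * level          # toy effect-site dynamics
        total_reward -= abs(target - level)       # penalize tracking error
        states.append(s)
        actions.append(a)
    return states, actions, total_reward

policy = PolicyNet()
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
loss_fn = nn.BCELoss()

for generation in range(50):
    episodes = [simulate_episode(policy) for _ in range(64)]
    returns = np.array([ep[2] for ep in episodes])
    cutoff = np.percentile(returns, 70)           # keep roughly the top 30%
    elite = [ep for ep in episodes if ep[2] >= cutoff]
    # Supervised step: push the policy toward the actions taken in elite episodes.
    s = torch.stack([st for ep in elite for st in ep[0]])
    a = torch.tensor([ac for ep in elite for ac in ep[1]]).unsqueeze(1)
    opt.zero_grad()
    loss = loss_fn(policy(s), a)
    loss.backward()
    opt.step()
```

At test time, per the abstract, the Bernoulli sampling step would be replaced by a deterministic rule that converts the network's infusion probability directly into a continuous infusion rate.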
