Paper Title
Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation
Paper Authors
Paper Abstract
Neural image-to-text radiology report generation systems offer the potential to improve radiology reporting by reducing the repetitive process of report drafting and identifying possible medical errors. However, existing report generation systems, despite achieving high performance on natural language generation metrics such as CIDEr or BLEU, still suffer from incomplete and inconsistent generations. Here we introduce two new simple rewards to encourage the generation of factually complete and consistent radiology reports: one that encourages the system to generate radiology domain entities consistent with the reference, and one that uses natural language inference to encourage these entities to be described in inferentially consistent ways. We combine these with the novel use of an existing semantic equivalence metric (BERTScore). We further propose a report generation system that optimizes these rewards via reinforcement learning. On two open radiology report datasets, our system substantially improved the F1 score of clinical information extraction by +22.1 (Δ +63.9%). We further show via a human evaluation and a qualitative analysis that our system leads to generations that are more factually complete and consistent compared to the baselines.
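As a rough illustration of the entity-based reward the abstract describes, the sketch below computes an F1-style overlap between radiology entities found in a generated report and its reference. The paper relies on a radiology-domain entity extractor; `extract_entities` here is a hypothetical toy stand-in with a hand-picked finding vocabulary so the example runs on its own, and the exact reward formulation in the paper may differ.

```python
# Minimal sketch of an entity-overlap reward for radiology report generation.
# Assumption: `extract_entities` is a toy placeholder; the actual system would
# use a radiology-domain NER model instead of a fixed keyword vocabulary.
from typing import Set


def extract_entities(report: str) -> Set[str]:
    """Hypothetical stand-in for a radiology entity extractor."""
    vocabulary = {"effusion", "pneumothorax", "consolidation", "edema", "cardiomegaly"}
    tokens = {tok.strip(".,").lower() for tok in report.split()}
    return tokens & vocabulary


def entity_match_reward(generated: str, reference: str) -> float:
    """F1 (harmonic mean of precision and recall) over entity sets."""
    gen_ents = extract_entities(generated)
    ref_ents = extract_entities(reference)
    if not gen_ents or not ref_ents:
        return 0.0
    overlap = len(gen_ents & ref_ents)
    precision = overlap / len(gen_ents)
    recall = overlap / len(ref_ents)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    ref = "There is a small right pleural effusion. No pneumothorax."
    gen = "Small right effusion is seen. No evidence of pneumothorax or edema."
    print(f"entity-match reward: {entity_match_reward(gen, ref):.3f}")  # 0.800
```

In the full system this reward would be combined with the NLI-based consistency reward and BERTScore, and the combined reward optimized with a policy-gradient method; the abstract only states that reinforcement learning is used, so the specific optimizer and reward weighting are not shown here.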