Paper Title

Automatic Assignment of Radiology Examination Protocols Using Pre-trained Language Models with Knowledge Distillation

Authors

Wilson Lau, Laura Aaltonen, Martin Gunn, Meliha Yetisgen

Abstract

Selecting radiology examination protocol is a repetitive and time-consuming process. In this paper, we present a deep learning approach to automatically assign protocols to computed tomography examinations by pre-training a domain-specific BERT model ($BERT_{rad}$). To handle the high data imbalance across exam protocols, we used a knowledge distillation approach that up-sampled the minority classes through data augmentation. We compared the classification performance of the described approach with statistical n-gram models using Support Vector Machine (SVM), Gradient Boosting Machine (GBM), and Random Forest (RF) classifiers, as well as Google's $BERT_{base}$ model. SVM, GBM, and RF achieved macro-averaged F1 scores of 0.45, 0.45, and 0.6, while $BERT_{base}$ and $BERT_{rad}$ achieved 0.61 and 0.63. Knowledge distillation improved overall performance on the minority classes, achieving an F1 score of 0.66.
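
The abstract describes distilling knowledge from a teacher model into a student model trained on minority classes up-sampled via data augmentation. Below is a minimal sketch of a generic soft-target distillation objective (Hinton-style) for illustration only; the function name, temperature T, and weighting alpha are assumptions and not necessarily the exact formulation used in the paper.

```python
# Minimal sketch of a generic knowledge-distillation objective (assumption:
# Hinton-style soft targets; not necessarily the loss used in the paper).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft-target term: KL divergence between temperature-scaled teacher and
    # student distributions, rescaled by T^2 as is conventional.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-label term: standard cross-entropy against the gold protocol labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

The macro-averaged F1 reported above weights every protocol class equally regardless of its frequency; it can be computed, for example, with `sklearn.metrics.f1_score(y_true, y_pred, average="macro")`.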
