Paper Title
Hierarchical Deep Learning Classification of Unstructured Pathology Reports to Automate ICD-O Morphology Grading
Paper Authors
Paper Abstract
Timely cancer reporting data are required to understand the impact of cancer, inform public health resource planning, and implement cancer policy, especially in sub-Saharan Africa, where cancer reporting lags behind world averages. Unstructured pathology reports, which contain tumor-specific data, are the main source of information collected by cancer registries. Manual processing and labelling of pathology reports with International Classification of Diseases for Oncology (ICD-O) codes by human coders employed by cancer registries has led to a considerable lag in cancer reporting. We present a hierarchical deep learning classification method that employs convolutional neural network (CNN) models to automate the classification of 1813 anonymized breast cancer pathology reports with applicable ICD-O morphology codes across 9 classes. We demonstrate that the hierarchical deep learning classification method outperforms a flat multiclass CNN model for ICD-O morphology classification of the same reports.