Paper Title

Performance Comparison of Balanced and Unbalanced Cancer Datasets using Pre-Trained Convolutional Neural Network

Authors

Narin, Ali

Abstract

Cancer is one of the leading causes of death worldwide, and breast cancer is especially common among women. Reaching a definitive diagnosis is a lengthy process, and the most important tool for early detection of this cancer type is the histopathological image obtained by biopsy. Pathologists examine these images and make the definitive diagnosis. Computer-aided support for this process is now common, and the literature reports detection of benign and malignant tumors, particularly using image data at different magnification levels. In this study, two study groups, one balanced and one unbalanced, were formed from the histopathological data in the BreakHis dataset, and we examined how performance in detecting tumor type differs between them. Using the InceptionV3 convolutional neural network model, the balanced data yielded 93.55% accuracy, 99.19% recall, and 87.10% specificity, while the unbalanced data yielded 89.75% accuracy, 82.89% recall, and 91.51% specificity. According to the results of the two studies, balancing the data increases the overall performance as well as the detection performance for both benign and malignant tumors. It can be said that a model trained on datasets created in a balanced way will give pathology specialists more accurate results.
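For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how accuracy, recall, and specificity are conventionally computed from binary confusion-matrix counts. The function name `binary_metrics` and the counts in the usage example are hypothetical and purely illustrative; they are not taken from the paper, and the abstract does not state which class the authors treat as positive.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float, float]:
    """Return (accuracy, recall, specificity) from confusion-matrix counts.

    tp/fn: positive-class samples classified correctly / incorrectly.
    tn/fp: negative-class samples classified correctly / incorrectly.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must contain at least one sample")

    accuracy = (tp + tn) / (positives + negatives)  # share of all samples classified correctly
    recall = tp / positives                         # true positive rate (sensitivity)
    specificity = tn / negatives                    # true negative rate
    return accuracy, recall, specificity


# Hypothetical counts for illustration only (not the paper's data).
accuracy, recall, specificity = binary_metrics(tp=90, fn=10, tn=80, fp=20)
print(f"accuracy={accuracy:.2%} recall={recall:.2%} specificity={specificity:.2%}")
```

Because accuracy pools both classes, it is dominated by the majority class when a test set is unbalanced, which is one reason recall and specificity are reported alongside it.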