A Novel Multiple Ensemble Learning Models Based on Different Datasets for Software Defect Prediction

Paper Title

A Novel Multiple Ensemble Learning Models Based on Different Datasets for Software Defect Prediction

Authors

Nawaz, Ali; Rehman, Attique Ur; Abbas, Muhammad

Abstract

Software testing is one of the important ways to ensure software quality. Testing has been found to account for more than 50% of overall project cost. Effective and efficient software testing uses minimal resources. It is therefore important to construct a procedure that not only performs efficient testing but also minimizes the use of project resources. The goal of software testing is to find as many defects as possible in the software system: the more defects found, the more efficient the testing. Different techniques have been proposed to detect defects in software while making good use of resources and achieving good results. As the world continues to move toward data-driven approaches for making important decisions, in this paper we perform a machine learning analysis on publicly available datasets and try to achieve the maximum accuracy. The major focus of the paper is to apply different machine learning techniques to the datasets and find out which technique produces the best results. In particular, we propose ensemble learning models and perform a comparative analysis against KNN, Decision Tree, SVM and Naïve Bayes on different datasets, and we demonstrate that the ensemble method outperforms the other methods in terms of accuracy, precision, recall and F1-score. The classification accuracy of the ensemble model trained on CM1 is 98.56%, that of the ensemble model trained on KM2 is 98.18%, and that of the ensemble learning model trained on PC1 is 99.27%. This shows that the ensemble approach is more effective for defect prediction than the other techniques.
