Paper title
I-SiamIDS: an improved Siam-IDS for handling class imbalance in network-based intrusion detection systems
Paper authors
Paper abstract
Network-based intrusion detection systems (NIDSs) identify malicious activity by analyzing network traffic. NIDSs are trained on samples of benign and intrusive network traffic, and each training sample belongs to either a majority or a minority class depending on the number of available instances. Majority classes contain abundant samples of normal traffic and of recurrent intrusions, whereas minority classes contain fewer samples of unknown events or infrequent intrusions. NIDSs trained on such imbalanced data tend to give predictions biased against minority attack classes, so intrusions go undetected or are misclassified. Past research handled this class imbalance problem with data-level approaches that either increase the minority class samples or decrease the majority class samples in the training data set. Although these data-level balancing approaches indirectly improve the performance of NIDSs, they do not address the underlying issue, namely that NIDSs are unable to identify attacks for which only limited training data is available. This paper proposes an algorithm-level approach called I-SiamIDS, a two-layer ensemble for handling the class imbalance problem. I-SiamIDS identifies both majority and minority classes at the algorithm level without using any data-level balancing techniques. The first layer of I-SiamIDS uses an ensemble of b-XGBoost, Siamese-NN and DNN to hierarchically filter input samples and identify attacks. These attacks are then sent to the second layer, which uses m-XGBoost to classify them into different attack classes. Compared with its counterparts, I-SiamIDS showed significant improvement in Accuracy, Recall, Precision, F1-score and AUC on both the NSL-KDD and CIDDS-001 datasets. To further strengthen the results, a computational cost analysis was also performed to study the acceptability of the proposed I-SiamIDS.
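
The two-layer flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify how the hierarchical filtration is wired, so the sketch assumes a simple cascade in which a sample is passed to the next first-layer detector only if the previous one judged it normal. The function `two_layer_predict`, the threshold-based stub detectors, the feature names (`conn_rate`, `failed_logins`, `root_shell`) and the attack labels are all hypothetical stand-ins for the trained b-XGBoost, Siamese-NN, DNN and m-XGBoost models.

```python
from typing import Callable, Dict, List, Sequence

Sample = Dict[str, float]
BinaryDetector = Callable[[Sample], bool]   # True means "attack"
AttackClassifier = Callable[[Sample], str]  # returns an attack class label


def two_layer_predict(
    samples: Sequence[Sample],
    detectors: Sequence[BinaryDetector],
    attack_classifier: AttackClassifier,
) -> List[str]:
    """Label each sample "normal" or with an attack class."""
    labels = []
    for sample in samples:
        # Layer 1 (assumed cascade): any() short-circuits, so a later
        # detector only sees samples that earlier detectors judged normal.
        is_attack = any(detect(sample) for detect in detectors)
        # Layer 2: only flagged samples reach the multi-class model.
        labels.append(attack_classifier(sample) if is_attack else "normal")
    return labels


# Hypothetical stand-ins for the trained models (simple thresholds only).
def b_xgboost_stub(s: Sample) -> bool:
    return s["conn_rate"] > 100.0


def siamese_nn_stub(s: Sample) -> bool:
    return s["failed_logins"] >= 3


def dnn_stub(s: Sample) -> bool:
    return s["root_shell"] > 0


def m_xgboost_stub(s: Sample) -> str:
    # Illustrative attack labels; a real model would be trained on the data set.
    if s["conn_rate"] > 100.0:
        return "DoS"
    if s["failed_logins"] >= 3:
        return "R2L"
    return "U2R"


traffic = [
    {"conn_rate": 5.0, "failed_logins": 0.0, "root_shell": 0.0},
    {"conn_rate": 500.0, "failed_logins": 0.0, "root_shell": 0.0},
    {"conn_rate": 3.0, "failed_logins": 5.0, "root_shell": 0.0},
    {"conn_rate": 2.0, "failed_logins": 0.0, "root_shell": 1.0},
]

predictions = two_layer_predict(
    traffic, [b_xgboost_stub, siamese_nn_stub, dnn_stub], m_xgboost_stub
)
print(predictions)
```

In this arrangement the rare-attack detectors act as a safety net behind the first binary model, which is one plausible reading of how an algorithm-level ensemble could catch minority classes without resampling the training data.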