Paper Title

Assembling a Cyber Range to Evaluate Artificial Intelligence / Machine Learning (AI/ML) Security Tools

Paper Authors

Nichols, Jeffrey A., Spakes, Kevin D., Watson, Cory L., Bridges, Robert A.

Paper Abstract

In this case study, we describe the design and assembly of a cyber security testbed at Oak Ridge National Laboratory in Oak Ridge, TN, USA. The range is designed to provide agile reconfiguration to facilitate a wide variety of experiments for evaluating cyber security tools -- particularly those involving AI/ML. In particular, the testbed provides realistic test environments while permitting control and programmatic observation/data collection during the experiments. We have designed in the ability to repeat evaluations, so additional tools can be evaluated and compared at a later time. The system can be scaled up or down to match experiment size. At the time of the conference we will have completed two full-scale, national, government challenges on this range. These challenges are evaluating the performance and operating costs of AI/ML-based cyber security tools for application to large, government-sized networks. These evaluations will be described as examples providing motivation and context for various design decisions and adaptations we have made. The first challenge measured end-point security tools against 100K file samples (benignware and malware) chosen across a range of file types. The second is an evaluation of network intrusion detection systems' efficacy in identifying multi-step adversarial campaigns -- involving reconnaissance, penetration and exploitation, lateral movement, etc. -- with varying levels of covertness in a high-volume business network. The scale of each of these challenges requires automation systems to repeat, or simultaneously mirror, identical experiments for each ML tool under test. Providing an array of easy-to-difficult malicious activity for sussing out the true abilities of the AI/ML tools has been a particularly interesting and challenging aspect of designing and executing these challenge events.
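The abstract does not publish the range's orchestration interface, so the following is only a minimal sketch of the mirroring idea it describes: replaying one identical, ordered sample stream against every tool under test so that runs remain repeatable and directly comparable. All names here (`ExperimentSpec`, `run_mirrored_experiments`, the toy detector callables) are hypothetical stand-ins, not the paper's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ExperimentSpec:
    """Hypothetical description of one repeatable experiment run."""
    name: str
    samples: List[str]  # e.g., identifiers of benignware/malware samples or replayed traffic


def run_mirrored_experiments(
    spec: ExperimentSpec,
    tools: Dict[str, Callable[[str], bool]],
) -> Dict[str, List[bool]]:
    """Present the identical ordered sample stream to every tool under test,
    so results can be compared across tools and across reruns."""
    results: Dict[str, List[bool]] = {}
    for tool_name, detect in tools.items():
        # Each tool sees exactly the same samples in exactly the same order.
        results[tool_name] = [detect(sample) for sample in spec.samples]
    return results


if __name__ == "__main__":
    # Toy detectors standing in for the AI/ML security tools being evaluated.
    tools = {
        "tool_a": lambda s: "malware" in s,
        "tool_b": lambda s: s.endswith(".exe"),
    }
    spec = ExperimentSpec(
        name="endpoint-challenge-demo",
        samples=["benign_report.pdf", "malware_dropper.exe", "benign_setup.exe"],
    )
    print(run_mirrored_experiments(spec, tools))
```

In the actual challenges, the detector callables would be replaced by the endpoint or network intrusion detection tools under evaluation, and the sample list by the 100K-file corpus or the scripted adversarial campaign traffic rather than in-memory strings.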
