Paper Title

ProAPT: Projection of APT Threats with Deep Reinforcement Learning

Paper Authors

Motahareh Dehghan, Babak Sadeghiyan, Erfan Khosravian, Alireza Sedighi Moghaddam, Farshid Nooshi

Abstract

The highest level in the Endsley situation awareness model, called projection, predicts the status of elements in the environment in the near future. In cybersecurity situation awareness, projection for an Advanced Persistent Threat (APT) requires predicting the next step of the APT. Threats are constantly changing and becoming more complex. Because supervised and unsupervised learning methods require APT datasets to project the next step of an APT, they are unable to identify unknown APT threats. In reinforcement learning, the agent interacts with the environment, so it can project the next step of both known and unknown APTs. To date, reinforcement learning has not been used to project the next step of APTs. In reinforcement learning, the agent uses previous states and actions to approximate the best action for the current state. When the number of states and actions is large, the agent employs a neural network, an approach known as deep reinforcement learning, to approximate the best action for each state. In this paper, we present a deep reinforcement learning system to project the next step of APTs. As there exist relations between attack steps, we employ a Long Short-Term Memory (LSTM) network to approximate the best action for each state. In our proposed system, based on the current situation, we project the next steps of APT threats.
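The core idea of the abstract, projecting an APT's next step as a reinforcement learning problem, can be illustrated with a minimal sketch. The stage names, reward scheme, and simulated environment below are illustrative assumptions, not the paper's actual state/action spaces, and a Q-table stands in for the paper's deep LSTM network so the sketch stays dependency-free:

```python
import random

# Hypothetical APT kill-chain stages (illustrative only; the paper derives
# its states from observed network situations, which are not shown here).
STAGES = ["recon", "delivery", "exploitation", "installation", "c2", "actions"]

def step(state, action):
    """Simulated environment: the attack always advances one stage.
    The agent's action is its projection of the next step; reward 1 if
    the projection matches the true next stage, else 0."""
    true_next = min(state + 1, len(STAGES) - 1)
    reward = 1.0 if action == true_next else 0.0
    return true_next, reward

# Tabular Q-learning in place of the paper's LSTM function approximator.
Q = [[0.0] * len(STAGES) for _ in STAGES]
alpha, gamma, eps = 0.5, 0.9, 0.1
random.seed(0)

for episode in range(500):
    s = 0
    while s < len(STAGES) - 1:
        # Epsilon-greedy exploration over candidate next steps.
        if random.random() < eps:
            a = random.randrange(len(STAGES))
        else:
            a = max(range(len(STAGES)), key=lambda x: Q[s][x])
        s2, r = step(s, a)
        # Standard Q-learning update.
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# Greedy policy = the learned projection of the next APT step per stage.
projection = {STAGES[s]: STAGES[max(range(len(STAGES)), key=lambda x: Q[s][x])]
              for s in range(len(STAGES) - 1)}
print(projection)
```

After training, the greedy action from each stage points to the following stage, i.e. the agent has learned to project the attack's next step from interaction alone, without a labeled APT dataset, which is the advantage over supervised methods that the abstract highlights.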
