Paper Title
Effective Removal of Operational Log Messages: an Application to Model Inference
Paper Authors
Paper Abstract
Model inference aims to extract accurate models from the execution logs of software systems. In practice, however, logs may contain "noise" that can degrade the performance of model inference. One common form of noise arises in system logs that contain not only transactional messages---logging the functional behavior of the system---but also operational messages---recording the operational state of the system (e.g., a periodic heartbeat to keep track of memory usage). In low-quality logs, transactional and operational messages are randomly interleaved, leading to the erroneous inclusion of operational behaviors in a system model that ideally should reflect only the functional behavior of the system. It is therefore important to remove operational messages from the logs before inferring models. In this paper, we propose LogCleaner, a novel technique for removing operational log messages. LogCleaner first performs a periodicity analysis to filter out periodic messages; it then performs a dependency analysis to calculate the degree of dependency for all log messages and to remove operational messages based on their dependencies. The experimental results on two proprietary and 11 publicly available log datasets show that LogCleaner, on average, can accurately remove 98% of the operational messages and preserve 81% of the transactional messages. Furthermore, using logs pre-processed with LogCleaner decreases the execution time of model inference (with a speed-up ranging from 1.5 to 946.7 depending on the characteristics of the system) and significantly improves the accuracy of the inferred models, by increasing their ability to accept correct system behaviors (+43.8 pp on average, with pp=percentage points) and to reject incorrect system behaviors (+15.0 pp on average).
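To make the idea of a periodicity analysis concrete, the following is a minimal illustrative sketch and not LogCleaner's actual algorithm, which the abstract does not specify. It assumes each log entry is a `(timestamp, template)` pair and treats a template as periodic when the gaps between its occurrences are nearly constant. The function names, the `min_occurrences` and `max_cv` parameters, and the sample log are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def find_periodic_templates(log, min_occurrences=4, max_cv=0.1):
    """Return templates whose inter-arrival gaps are nearly constant.

    Assumption (not from the paper): a template counts as periodic when the
    coefficient of variation of its gaps is at most max_cv.
    """
    times = defaultdict(list)
    for timestamp, template in log:
        times[template].append(timestamp)

    periodic = set()
    for template, stamps in times.items():
        if len(stamps) < min_occurrences:
            continue  # too few occurrences to judge periodicity
        stamps.sort()
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        average_gap = mean(gaps)
        if average_gap > 0 and pstdev(gaps) / average_gap <= max_cv:
            periodic.add(template)
    return periodic


def remove_periodic(log, **kwargs):
    """Drop every entry whose template was flagged as periodic."""
    periodic = find_periodic_templates(log, **kwargs)
    return [(ts, template) for ts, template in log if template not in periodic]


# Hypothetical log: a heartbeat every 10 seconds interleaved with
# transactional messages at irregular times.
example_log = [
    (0.0, "heartbeat mem"),
    (1.2, "login"),
    (10.0, "heartbeat mem"),
    (13.7, "query"),
    (20.0, "heartbeat mem"),
    (21.1, "query"),
    (30.0, "heartbeat mem"),
    (34.9, "logout"),
]
cleaned = remove_periodic(example_log)
```

On this sample, only the heartbeat template is flagged, so `cleaned` keeps the login, query, and logout entries in their original order. A filter of this kind covers only the first stage described in the abstract; the dependency analysis that handles non-periodic operational messages is not sketched here.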