Explaining Software Bugs Leveraging Code Structures in Neural Machine Translation

Paper title

Explaining Software Bugs Leveraging Code Structures in Neural Machine Translation

Authors

Mahbub, Parvez; Shuvo, Ohiduzzaman; Rahman, Mohammad Masudur

Abstract

Software bugs claim approximately 50% of development time and cost the global economy billions of dollars. Once a bug is reported, the assigned developer attempts to identify and understand the source code responsible for the bug and then corrects the code. Over the last five decades, there has been significant research on automatically finding or correcting software bugs. However, there has been little research on automatically explaining the bugs to the developers, which is an essential but highly challenging task. In this paper, we propose Bugsplainer, a transformer-based generative model that generates natural language explanations for software bugs by learning from a large corpus of bug-fix commits. Bugsplainer can leverage structural information and buggy patterns from the source code to generate an explanation for a bug. Our evaluation using three performance metrics shows that Bugsplainer can generate understandable and good explanations according to Google's standard, and can outperform multiple baselines from the literature. We also conduct a developer study involving 20 participants where the explanations from Bugsplainer were found to be more accurate, more precise, more concise and more useful than the baselines.
