CC2Vec: Distributed Representations of Code Changes

Paper Title

CC2Vec: Distributed Representations of Code Changes

Authors

Thong Hoang, Hong Jin Kang, Julia Lawall, David Lo

Abstract

Existing work on software patches often uses features specific to a single task. These approaches often rely on manually identified features, and human effort is required to identify these features for each task. In this work, we propose CC2Vec, a neural network model that learns a representation of code changes guided by their accompanying log messages, which represent the semantic intent of the code changes. CC2Vec models the hierarchical structure of a code change with the help of the attention mechanism and uses multiple comparison functions to identify the differences between the removed and added code. To evaluate whether CC2Vec can produce a distributed representation of code changes that is general and useful for multiple tasks on software patches, we use the vectors produced by CC2Vec for three tasks: log message generation, bug fixing patch identification, and just-in-time defect prediction. In all tasks, the models using CC2Vec outperform the state-of-the-art techniques.
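The abstract mentions "multiple comparison functions" applied to the removed and added code but does not list them here. As a rough illustration of the idea only, the sketch below compares two already-embedded code vectors with four common choices (element-wise subtraction, element-wise multiplication, cosine similarity, and Euclidean distance) and concatenates the results into one feature vector. The function names, the choice of these four functions, and the toy vectors are assumptions made for this example, not a description of the authors' implementation.

```python
import math
from typing import List

Vector = List[float]


def elementwise_sub(removed: Vector, added: Vector) -> Vector:
    """Per-dimension difference between the two embeddings."""
    return [r - a for r, a in zip(removed, added)]


def elementwise_mul(removed: Vector, added: Vector) -> Vector:
    """Per-dimension product between the two embeddings."""
    return [r * a for r, a in zip(removed, added)]


def cosine_similarity(removed: Vector, added: Vector) -> float:
    """Cosine of the angle between the embeddings (0.0 if either is all zeros)."""
    dot = sum(r * a for r, a in zip(removed, added))
    norm_r = math.sqrt(sum(r * r for r in removed))
    norm_a = math.sqrt(sum(a * a for a in added))
    if norm_r == 0.0 or norm_a == 0.0:
        return 0.0
    return dot / (norm_r * norm_a)


def euclidean_distance(removed: Vector, added: Vector) -> float:
    """Straight-line distance between the embeddings."""
    return math.sqrt(sum((r - a) ** 2 for r, a in zip(removed, added)))


def compare(removed: Vector, added: Vector) -> Vector:
    """Concatenate the outputs of all comparison functions (hypothetical helper)."""
    if len(removed) != len(added):
        raise ValueError("embeddings must have the same dimension")
    return (
        elementwise_sub(removed, added)
        + elementwise_mul(removed, added)
        + [cosine_similarity(removed, added)]
        + [euclidean_distance(removed, added)]
    )


# Toy embeddings standing in for the encoded removed and added code.
removed_vec = [1.0, 2.0, 3.0]
added_vec = [1.0, 0.0, 3.0]
features = compare(removed_vec, added_vec)
```

For embeddings of dimension d, this hypothetical `compare` helper returns a vector of length 2d + 2; a downstream model would consume it in place of hand-crafted, task-specific features.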
