Paper Title

Generalizability of Code Clone Detection on CodeBERT

Authors

Tim Sonnekalb, Bernd Gruner, Clemens-Alexander Brust, Patrick Mäder

Abstract

Transformer networks such as CodeBERT already achieve outstanding results for code clone detection on benchmark datasets, so one could assume that this task is already solved. However, code clone detection is not a trivial task, and semantic code clones in particular are challenging to detect. By evaluating two different subsets of Java code clones from BigCloneBench, we show that the generalizability of CodeBERT decreases: we observe a significant drop in F1 score when we evaluate on code snippets and functionality IDs that differ from those used for model building.
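The evaluation idea in the abstract, holding out whole functionality IDs so that test clones come from functionalities the model never saw, can be illustrated with a minimal sketch. This is not the authors' code: the record fields (`func_id`, `label`), the helper names, and the toy data are all hypothetical, and the predictions are placeholders rather than CodeBERT outputs.

```python
# Illustrative sketch only: field names, helpers, and data are assumptions,
# not taken from the paper or from BigCloneBench tooling.

def split_by_functionality(pairs, held_out_ids):
    """Split clone pairs so that held-out functionality IDs appear only in the test set."""
    train = [p for p in pairs if p["func_id"] not in held_out_ids]
    test = [p for p in pairs if p["func_id"] in held_out_ids]
    return train, test


def f1_score(y_true, y_pred):
    """Binary F1 for labels in {0, 1}; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy records: each is a candidate clone pair with a functionality ID
# and a ground-truth label (1 = clone, 0 = not a clone).
pairs = [
    {"func_id": 1, "label": 1},
    {"func_id": 1, "label": 0},
    {"func_id": 2, "label": 1},
    {"func_id": 3, "label": 0},
    {"func_id": 3, "label": 1},
]

train, test = split_by_functionality(pairs, held_out_ids={3})

# Placeholder predictions standing in for a model's output on the held-out set.
y_true = [p["label"] for p in test]
y_pred = [1, 1]
held_out_f1 = f1_score(y_true, y_pred)
```

Comparing `held_out_f1` with the F1 obtained on a split that shares functionality IDs between training and test data gives a rough measure of the kind of generalization gap the abstract describes.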
