Paper title
Generalizability of Code Clone Detection on CodeBERT
Paper authors
Paper abstract
Transformer networks such as CodeBERT already achieve outstanding results for code clone detection on benchmark datasets, so one could assume that the task is solved. However, code clone detection is not trivial, and semantic code clones in particular are challenging to detect. By evaluating on two different subsets of Java code clones from BigCloneBench, we show that CodeBERT generalizes poorly beyond the data it was built on. We observe a significant drop in F1 score when the model is evaluated on code snippets and functionality IDs that differ from those used for model building.
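The evaluation idea in the abstract, holding out whole functionality IDs so that the evaluation pairs come from functionalities the model never saw, can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Pair` record, its field names, and both helper functions are hypothetical and only assume that each clone pair carries a functionality ID and a clone label.

```python
from collections import namedtuple

# Hypothetical record for one candidate pair (field names are assumptions):
# two snippet IDs, the functionality ID the pair belongs to, and the clone label.
Pair = namedtuple("Pair", ["id1", "id2", "functionality_id", "is_clone"])


def split_by_functionality(pairs, held_out_ids):
    """Split pairs so that held-out functionality IDs appear only in the test set."""
    held_out = set(held_out_ids)
    train = [p for p in pairs if p.functionality_id not in held_out]
    test = [p for p in pairs if p.functionality_id in held_out]
    return train, test


def f1_score(y_true, y_pred):
    """F1 for the positive (clone) class, computed from boolean labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Comparing `f1_score` on a test set drawn from the training functionalities against one drawn only from held-out functionalities would be one way to expose the kind of drop the abstract reports.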