Paper Title

CoNCRA: A Convolutional Neural Network Code Retrieval Approach

Authors

Marcelo de Rezende Martins, Marco A. Gerosa

Abstract

Software developers routinely search for code using general-purpose search engines. However, these search engines cannot find code semantically unless it has an accompanying description. We propose a technique for semantic code search: a Convolutional Neural Network approach to code retrieval (CoNCRA). Our technique aims to find the code snippet that most closely matches the developer's intent, expressed in natural language. We evaluated our approach's efficacy on a dataset composed of questions and code snippets collected from Stack Overflow. Our preliminary results showed that our technique, which prioritizes local interactions (nearby words), improved on the state of the art (SOTA) by 5% on average, retrieving the most relevant code snippets in the top 3 positions almost 80% of the time. Therefore, our technique is promising and can improve the efficacy of semantic code retrieval.
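To make the "top 3 positions" figure concrete, here is a minimal sketch of how a retrieval system of this kind could rank snippets and how a top-k success rate could be computed. It is an illustration under stated assumptions, not the authors' implementation: the vectors are toy embeddings rather than outputs of a convolutional network, cosine similarity is assumed as the scoring function, and the names `cosine`, `rank_snippets`, and `success_at_k` are hypothetical. The paper's exact metric definition may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_snippets(query_vec, snippet_vecs):
    """Return snippet indices sorted from most to least similar to the query."""
    scores = [cosine(query_vec, s) for s in snippet_vecs]
    return sorted(range(len(snippet_vecs)), key=lambda i: scores[i], reverse=True)


def success_at_k(rankings, relevant, k=3):
    """Fraction of queries whose relevant snippet index appears in the top k results."""
    hits = sum(1 for ranked, rel in zip(rankings, relevant) if rel in ranked[:k])
    return hits / len(rankings)


# Toy data (assumption): one query embedding and three candidate snippet embeddings.
query = [1.0, 0.0]
snippets = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ranking = rank_snippets(query, snippets)
```

In this sketch, a success rate of 0.8 at k=3 would mean that for 80% of the queries the relevant snippet was ranked among the first three candidates.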
