Paper title
The impact of lexical and grammatical processing on generating code from natural language
Paper authors
Paper abstract
Considering the seq2seq architecture of TranX for natural-language-to-code translation, we identify four key components: grammatical constraints, lexical preprocessing, input representations, and copy mechanisms. To study the impact of these components, we use a state-of-the-art architecture that relies on a BERT encoder and a grammar-based decoder, for which a formalization is provided. The paper highlights the importance of the lexical substitution component in current natural-language-to-code systems.