Paper Title

Overestimation of Syntactic Representation in Neural Language Models

Authors

Jordan Kodner, Nitish Gupta

Abstract

With the advent of powerful neural language models over the last few years, research attention has increasingly focused on what aspects of language they represent that make them so successful. Several testing methodologies have been developed to probe models' syntactic representations. One popular method for determining a model's ability to induce syntactic structure trains a model on strings generated according to a template then tests the model's ability to distinguish such strings from superficially similar ones with different syntax. We illustrate a fundamental problem with this approach by reproducing positive results from a recent paper with two non-syntactic baseline language models: an n-gram model and an LSTM model trained on scrambled inputs.
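The abstract's point is that a model with no syntactic representation at all can still pass a template-discrimination test, because template-generated strings differ from their foils in surface co-occurrence statistics. A minimal sketch of that baseline idea, using a hypothetical toy corpus and an add-alpha smoothed bigram model (the n-gram order, smoothing, and example sentences are illustrative assumptions, not the paper's actual setup):

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count bigram and context-unigram frequencies over tokenized sentences."""
    bigrams, unigrams = Counter(), Counter()
    for sent in corpus:
        toks = ["<s>"] + sent.split() + ["</s>"]
        unigrams.update(toks[:-1])          # contexts only (denominators)
        bigrams.update(zip(toks, toks[1:]))
    return bigrams, unigrams

def logprob(bigrams, unigrams, sent, vocab_size, alpha=1.0):
    """Add-alpha smoothed bigram log-probability of a sentence."""
    toks = ["<s>"] + sent.split() + ["</s>"]
    lp = 0.0
    for a, b in zip(toks, toks[1:]):
        lp += math.log((bigrams[(a, b)] + alpha) /
                       (unigrams[a] + alpha * vocab_size))
    return lp

# Hypothetical template-generated "grammatical" training strings.
grammatical = ["the dog chases the cat", "the cat chases the dog"]

bi, uni = train_bigram(grammatical)
V = len(uni) + 1  # contexts plus </s>

# A purely non-syntactic model still prefers the in-template string
# over a superficially similar scrambled foil.
score_good = logprob(bi, uni, "the dog chases the cat", V)
score_foil = logprob(bi, uni, "dog the the chases cat", V)
```

Here `score_good > score_foil` follows from bigram counts alone, which is exactly why a positive discrimination result does not by itself show that a model has induced syntactic structure.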
