Paper Title
Toward a Thermodynamics of Meaning
Paper Authors
Paper Abstract
As language models such as GPT-3 become increasingly successful at generating realistic text, questions about what purely text-based modeling can learn about the world have become more urgent. Is text purely syntactic, as skeptics argue? Or does it in fact contain some semantic information that a sufficiently sophisticated language model could use to learn about the world without any additional inputs? This paper describes a new model that suggests some qualified answers to those questions. By theorizing the relationship between text and the world it describes as an equilibrium between a thermodynamic system and a much larger reservoir, this paper argues that even very simple language models do learn structural facts about the world, while also proposing relatively precise limits on the nature and extent of those facts. This perspective promises not only to answer questions about what language models actually learn, but also to explain the consistent and surprising success of cooccurrence prediction as a meaning-making strategy in AI.
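To make the abstract's two central terms concrete: "equilibrium with a reservoir" invokes the standard thermodynamic result that a small system exchanging energy with a much larger reservoir settles into the Boltzmann distribution, P(s) ∝ exp(−E(s)/kT), and "cooccurrence prediction" refers to models that learn meaning by estimating how often words appear together. The sketch below is not from the paper; it is a minimal illustration of the simplest member of that model family, assuming pointwise mutual information (PMI) over a toy corpus with an arbitrary window size.

```python
# Illustrative sketch only (not the paper's model): pointwise mutual
# information (PMI) computed from raw word co-occurrence counts, the
# simplest form of the "cooccurrence prediction" strategy named above.
import math
from collections import Counter

corpus = [
    "ice melts into water".split(),
    "steam condenses into water".split(),
    "ice and steam are water".split(),
]
window = 3  # co-occurrence window size; an arbitrary choice for this toy corpus

word_counts = Counter()
pair_counts = Counter()
total_pairs = 0

for sentence in corpus:
    word_counts.update(sentence)
    for i, w in enumerate(sentence):
        # count each (word, earlier-context-word) pair once, order-independent
        for c in sentence[max(0, i - window):i]:
            pair_counts[tuple(sorted((w, c)))] += 1
            total_pairs += 1

total_words = sum(word_counts.values())

def pmi(w1, w2):
    """PMI(w1, w2) = log[ P(w1, w2) / (P(w1) * P(w2)) ]."""
    p_pair = pair_counts[tuple(sorted((w1, w2)))] / total_pairs
    if p_pair == 0:
        return float("-inf")  # never observed together
    return math.log(p_pair / ((word_counts[w1] / total_words) *
                              (word_counts[w2] / total_words)))

print(pmi("ice", "water"))        # positive: the words co-occur
print(pmi("melts", "condenses"))  # -inf: never observed together
```

On the paper's view, statistics like these are informative about the world because the text "system" has equilibrated against the world "reservoir" that produced it; the code above is only a stand-in for that class of cooccurrence models, not an implementation of anything in the paper itself.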