Paper Title

Sort by Structure: Language Model Ranking as Dependency Probing

Paper Authors

Max Müller-Eberstein, Rob van der Goot, Barbara Plank

Paper Abstract

Making an informed choice of pre-trained language model (LM) is critical for performance, yet environmentally costly, and as such widely underexplored. The field of Computer Vision has begun to tackle encoder ranking, with promising forays into Natural Language Processing, however they lack coverage of linguistic tasks such as structured prediction. We propose probing to rank LMs, specifically for parsing dependencies in a given language, by measuring the degree to which labeled trees are recoverable from an LM's contextualized embeddings. Across 46 typologically and architecturally diverse LM-language pairs, our probing approach predicts the best LM choice 79% of the time using orders of magnitude less compute than training a full parser. Within this study, we identify and analyze one recently proposed decoupled LM - RemBERT - and find it strikingly contains less inherent dependency information, but often yields the best parser after full fine-tuning. Without this outlier our approach identifies the best LM in 89% of cases.
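
To make the probing-based ranking concrete, the sketch below trains a small linear distance probe on frozen contextualized embeddings and uses the probe's fit as a cheap proxy for how much dependency structure a candidate LM encodes. This is an illustration under assumptions, not the authors' exact probe (which recovers labeled trees); the checkpoint names, toy token alignment, and scoring heuristic are placeholders.

```python
# Minimal sketch of probing-based LM ranking (illustrative, not the paper's implementation):
# a linear "structural probe" is trained so that squared distances between projected token
# vectors approximate gold dependency-tree distances; the probe's fit serves as a ranking score.
import torch
from transformers import AutoModel, AutoTokenizer


class DistanceProbe(torch.nn.Module):
    """Linear map B such that ||B h_i - B h_j||^2 approximates the tree distance d(i, j)."""

    def __init__(self, hidden_dim: int, probe_rank: int = 128):
        super().__init__()
        self.proj = torch.nn.Linear(hidden_dim, probe_rank, bias=False)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        projected = self.proj(hidden_states)                  # (seq_len, rank)
        diffs = projected.unsqueeze(1) - projected.unsqueeze(0)
        return diffs.pow(2).sum(-1)                           # (seq_len, seq_len) squared distances


def probe_score(model_name: str, sentences, gold_distances, epochs: int = 5) -> float:
    """Train only the probe (the LM stays frozen); return a proxy score for ranking.

    `gold_distances[i]` is an (n_i, n_i) float tensor of pairwise dependency-tree distances
    for the i-th sentence (a simplification: the paper targets labeled trees).
    """
    tok = AutoTokenizer.from_pretrained(model_name)
    lm = AutoModel.from_pretrained(model_name).eval()
    probe = DistanceProbe(lm.config.hidden_size)
    opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
    for _ in range(epochs):
        for sent, gold in zip(sentences, gold_distances):
            with torch.no_grad():                             # no gradients through the frozen LM
                enc = tok(sent, return_tensors="pt")
                hidden = lm(**enc).last_hidden_state[0]
            # Crude truncation stands in for proper subword-to-word alignment.
            pred = probe(hidden[: gold.shape[0]])
            loss = (pred - gold).abs().mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return -loss.item()                                       # higher = structure more recoverable


# Example usage: rank candidate checkpoints by probe score instead of training full parsers.
# candidates = ["bert-base-multilingual-cased", "xlm-roberta-base"]
# ranking = sorted(candidates, key=lambda m: probe_score(m, sents, dists), reverse=True)
```

Because only the lightweight probe is optimized while the LM remains frozen, scoring each candidate in this way costs orders of magnitude less compute than fine-tuning a full parser per LM, which is the efficiency argument the abstract makes.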
