Title
Cross-lingual Alignment Methods for Multilingual BERT: A Comparative Study
Authors
Abstract
Multilingual BERT (mBERT) has shown reasonable capability for zero-shot cross-lingual transfer when fine-tuned on downstream tasks. Since mBERT is not pre-trained with explicit cross-lingual supervision, transfer performance can be further improved by aligning mBERT with a cross-lingual signal. Prior work has proposed several approaches for aligning contextualised embeddings. In this paper, we analyse how different forms of cross-lingual supervision and various alignment methods influence the transfer capability of mBERT in a zero-shot setting. Specifically, we compare parallel-corpus vs. dictionary-based supervision, and rotation-based vs. fine-tuning-based alignment methods. We evaluate the performance of the different alignment methods across eight languages on two tasks: Named Entity Recognition and Semantic Slot Filling. In addition, we propose a novel normalisation method that consistently improves the performance of rotation-based alignment, including a notable 3% F1 improvement for distant and typologically dissimilar languages. Importantly, we identify biases of the alignment methods with respect to the type of task and the proximity of the transfer language. We also find that supervision from parallel corpora is generally superior to dictionary-based alignment.
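To make the rotation-based alignment concrete, below is a minimal sketch of the standard orthogonal-Procrustes formulation over contextual embeddings: given embeddings of aligned word pairs (e.g. extracted from mBERT over a parallel corpus or a bilingual dictionary), it finds the orthogonal rotation that maps the source space onto the target space. The function names and the toy data are illustrative assumptions, not the paper's implementation, and the paper's proposed normalisation step is not shown.

```python
# Sketch of rotation-based alignment via orthogonal Procrustes.
# Assumption: src/tgt are (n_pairs, dim) arrays of contextual embeddings
# of aligned words; real features would come from mBERT layers.
import numpy as np

def learn_rotation(src: np.ndarray, tgt: np.ndarray) -> np.ndarray:
    """Solve min_W ||src @ W - tgt||_F subject to W orthogonal.

    Closed-form solution: W = U V^T, where U S V^T = SVD(src^T tgt).
    """
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

def align(src_emb: np.ndarray, rotation: np.ndarray) -> np.ndarray:
    """Map source-language embeddings into the target space."""
    return src_emb @ rotation

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 768  # mBERT hidden size
    # Toy "aligned pair" embeddings standing in for real mBERT features:
    # the target space is an exact rotation of the source space.
    src = rng.normal(size=(1000, dim))
    true_rot, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    tgt = src @ true_rot
    w = learn_rotation(src, tgt)
    print("alignment error:", np.linalg.norm(align(src, w) - tgt))  # ~0
```

Because the learned map is constrained to a rotation, it preserves distances within the source space; fine-tuning-based alignment, by contrast, updates the encoder itself and is not constrained in this way.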