Paper Title
On Robust Prefix-Tuning for Text Classification
Paper Authors
Paper Abstract
Recently, prefix-tuning has gained increasing attention as a parameter-efficient finetuning method for large-scale pretrained language models. The method keeps the pretrained models fixed and only updates the prefix token parameters for each downstream task. Despite being lightweight and modular, prefix-tuning still lacks robustness to textual adversarial attacks. However, most currently developed defense techniques necessitate auxiliary model update and storage, which inevitably hamper the modularity and low storage of prefix-tuning. In this work, we propose a robust prefix-tuning framework that preserves the efficiency and modularity of prefix-tuning. The core idea of our framework is leveraging the layerwise activations of the language model by correctly-classified training data as the standard for additional prefix finetuning. During the test phase, an extra batch-level prefix is tuned for each batch and added to the original prefix for robustness enhancement. Extensive experiments on three text classification benchmarks show that our framework substantially improves robustness over several strong baselines against five textual attacks of different types while maintaining comparable accuracy on clean texts. We also interpret our robust prefix-tuning framework from the optimal control perspective and pose several directions for future research.
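The following is a minimal sketch of the test-phase idea described in the abstract: tuning an extra batch-level prefix so that the model's layerwise activations on a test batch move toward the "standard" activations recorded from correctly-classified training data. It assumes a PyTorch model that accepts a prefix argument and returns per-layer hidden states; the names `model`, `orig_prefix`, and `standard_activations` are illustrative assumptions, not the authors' implementation.

```python
import torch

def tune_batch_prefix(model, orig_prefix, standard_activations, batch,
                      steps=10, lr=1e-3):
    """Illustrative sketch (not the authors' code): tune an extra batch-level
    prefix so that the layerwise activations on this batch approach the
    recorded standard activations from correctly-classified training data."""
    extra_prefix = torch.zeros_like(orig_prefix, requires_grad=True)
    optimizer = torch.optim.Adam([extra_prefix], lr=lr)

    for _ in range(steps):
        optimizer.zero_grad()
        # The combined prefix is the original (fixed) prefix plus the
        # batch-level prefix; `prefix=` and `hidden_states` are assumed
        # interfaces of the wrapped language model.
        outputs = model(batch, prefix=orig_prefix + extra_prefix,
                        output_hidden_states=True)
        # Match each layer's activation summary to the training-time standard.
        loss = sum(
            torch.nn.functional.mse_loss(h.mean(dim=(0, 1)), std)
            for h, std in zip(outputs.hidden_states, standard_activations)
        )
        loss.backward()
        optimizer.step()

    # The pretrained LM and the original prefix stay fixed; only the
    # extra batch-level prefix is updated for this batch.
    return orig_prefix + extra_prefix.detach()
```

This keeps the pretrained model and the task prefix untouched, which is how the framework preserves the modularity and low storage cost that motivate prefix-tuning in the first place.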