Machine Translation System Selection from Bandit Feedback

Paper Title

Machine Translation System Selection from Bandit Feedback

Authors

Jason Naradowsky, Xuan Zhang, Kevin Duh

Abstract

Adapting machine translation systems in the real world is a difficult problem. In contrast to offline training, users cannot provide the type of fine-grained feedback (such as correct translations) typically used for improving the system. Moreover, different users have different translation needs, and even a single user's needs may change over time. In this work we take a different approach, treating the problem of adaptation as one of selection. Instead of adapting a single system, we train many translation systems using different architectures, datasets, and optimization methods. Using bandit learning techniques on simulated user feedback, we learn a policy to choose which system to use for a particular translation task. We show that our approach can (1) quickly adapt to address domain changes in translation tasks, (2) outperform the single best system in mixed-domain translation tasks, and (3) make effective instance-specific decisions when using contextual bandit strategies.
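
To make the idea of selection from bandit feedback concrete, the following is a minimal sketch, not the authors' implementation. It assumes an epsilon-greedy strategy (the abstract does not say which bandit algorithms are used), treats each trained translation system as one arm, and simulates user feedback as a noisy score in [0, 1]. The names `EpsilonGreedySelector`, `simulate`, and `mean_quality` are hypothetical and chosen for illustration only.

```python
import random


class EpsilonGreedySelector:
    """Hypothetical epsilon-greedy policy over a fixed pool of MT systems."""

    def __init__(self, n_systems, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_systems    # how often each system was chosen
        self.values = [0.0] * n_systems  # running mean reward per system
        self.rng = random.Random(seed)

    def select(self):
        # Explore a random system with probability epsilon,
        # otherwise exploit the system with the best estimate so far.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the mean reward for the chosen system.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(mean_quality, rounds=2000, seed=0):
    """Run the selector against simulated feedback (an assumption:
    Gaussian noise around a fixed per-system quality, clipped to [0, 1])."""
    rng = random.Random(seed + 1)
    selector = EpsilonGreedySelector(len(mean_quality), seed=seed)
    for _ in range(rounds):
        arm = selector.select()
        reward = min(1.0, max(0.0, rng.gauss(mean_quality[arm], 0.1)))
        selector.update(arm, reward)
    return selector


if __name__ == "__main__":
    selector = simulate([0.2, 0.5, 0.8])
    print(selector.counts)
```

In this sketch only a scalar score for the chosen system is observed, never a reference translation, which mirrors the weak feedback setting the abstract describes. A contextual variant would additionally condition `select` on features of the input sentence.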
