Paper Title
Improvement of a dedicated model for open domain persona-aware dialogue generation
Paper Authors
Paper Abstract
This paper analyzes several methods proposed in recent years for improving the speed and performance of the Transformer architecture, focusing on their application to the training of a dedicated model. The dedicated model studied here is an open-domain persona-aware dialogue generation model; its dataset consists of multi-turn short dialogues in which the total length of a single input sequence is no more than 105 tokens. Consequently, the many improvements to the Transformer architecture and its attention mechanism that target long-sequence processing are not discussed in this paper. The source code of the experiments has been open-sourced: https://github.com/ghosthamlet/persona
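To make the 105-token budget concrete, here is a minimal sketch, not taken from the paper or its repository, of how persona facts and multi-turn dialogue history might be packed into a single input sequence. The function `build_input`, the constant `MAX_TOKENS`, and the whitespace tokenization are all illustrative assumptions; a real implementation would use the model's subword tokenizer.

```python
# A minimal sketch (illustrative, not the paper's actual preprocessing) of
# packing persona facts plus multi-turn history into one input sequence
# under the 105-token budget stated in the abstract.

MAX_TOKENS = 105  # per-sequence budget from the abstract


def build_input(persona: list[str], history: list[str]) -> list[str]:
    """Concatenate persona facts and dialogue turns, newest turns last,
    dropping the oldest history turns until the sequence fits."""
    # Whitespace splitting stands in for a real subword tokenizer.
    persona_tokens = [tok for fact in persona for tok in fact.split()]
    turns = [turn.split() for turn in history]
    # Drop the oldest turns first; persona facts are kept intact.
    while turns and len(persona_tokens) + sum(len(t) for t in turns) > MAX_TOKENS:
        turns.pop(0)
    tokens = persona_tokens + [tok for turn in turns for tok in turn]
    return tokens[:MAX_TOKENS]


if __name__ == "__main__":
    persona = ["i love hiking .", "i have two dogs ."]
    history = ["hi , how are you ?", "great , just back from a trail walk ."]
    seq = build_input(persona, history)
    assert len(seq) <= MAX_TOKENS
    print(len(seq), seq[:8])
```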