SLOGAN: Handwriting Style Synthesis for Arbitrary-Length and Out-of-Vocabulary Text

Title

SLOGAN: Handwriting Style Synthesis for Arbitrary-Length and Out-of-Vocabulary Text

Authors

Canjie Luo, Yuanzhi Zhu, Lianwen Jin, Zhe Li, Dezhi Peng

Abstract

Large amounts of labeled data are urgently required for the training of robust text recognizers. However, collecting handwriting data of diverse styles, along with an immense lexicon, is considerably expensive. Although data synthesis is a promising way to relieve data hunger, two key issues of handwriting synthesis, namely, style representation and content embedding, remain unsolved. To this end, we propose a novel method that can synthesize parameterized and controllable handwriting Styles for arbitrary-Length and Out-of-vocabulary text based on a Generative Adversarial Network (GAN), termed SLOGAN. Specifically, we propose a style bank to parameterize the specific handwriting styles as latent vectors, which are input to a generator as style priors to achieve the corresponding handwritten styles. The training of the style bank requires only the writer identification of the source images, rather than attribute annotations. Moreover, we embed the text content by providing an easily obtainable printed style image, so that the diversity of the content can be flexibly achieved by changing the input printed image. Finally, the generator is guided by dual discriminators to handle both the handwriting characteristics that appear as separated characters and in a series of cursive joins. Our method can synthesize words that are not included in the training vocabulary and with various new styles. Extensive experiments have shown that high-quality text images with great style diversity and rich vocabulary can be synthesized using our method, thereby enhancing the robustness of the recognizer.
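The style bank described in the abstract can be pictured as a lookup table that holds one latent vector per writer ID. The sketch below is only an illustration of that idea, not the authors' implementation: the names `StyleBank`, `lookup`, and `blend` are hypothetical, the vectors are random placeholders rather than parameters learned jointly with the generator, and blending two stored vectors to obtain a new style is an assumption that the abstract does not spell out.

```python
import random


class StyleBank:
    """Hypothetical style bank: one latent style vector per writer ID.

    In a trained model these vectors would be learned; here they are
    random placeholders so the sketch stays self-contained.
    """

    def __init__(self, num_writers, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.vectors = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)]
            for _ in range(num_writers)
        ]

    def lookup(self, writer_id):
        # Only the writer ID is needed to select a style, which matches the
        # abstract's point that no attribute annotations are required.
        return list(self.vectors[writer_id])


def blend(style_a, style_b, t):
    """Linearly mix two style vectors (assumed way to form a new style)."""
    return [(1.0 - t) * a + t * b for a, b in zip(style_a, style_b)]


bank = StyleBank(num_writers=3, dim=4)
style_0 = bank.lookup(0)
style_1 = bank.lookup(1)
new_style = blend(style_0, style_1, 0.5)  # halfway between writers 0 and 1
```

In the method as described, a vector such as `new_style` would be passed to the generator as the style prior, together with a printed image of the target text that supplies the content.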
