Digging Deeper into CRNN Model in Chinese Text Images Recognition

Paper Title

Digging Deeper into CRNN Model in Chinese Text Images Recognition

Authors

Yu, Kunhong; Zhang, Yuze

Abstract

Automatic text image recognition is a prevalent application in the field of computer vision. One efficient approach is to use a Convolutional Recurrent Neural Network (CRNN) to accomplish the task in an end-to-end (End2End) fashion. However, CRNN notoriously fails on multi-row images and Excel-like images. In this paper, we present an alternative that first recognizes single-row images and then extends the same architecture to recognize multi-row images using several proposed methods. To recognize Excel-like images containing box lines, we propose the Line-Deep Denoising Convolutional AutoEncoder (Line-DDeCAE) to recover box lines. Finally, we present a Knowledge Distillation (KD) method to compress the original CRNN model without loss of generality. To carry out experiments, we first generate artificial samples from a Chinese novel, then conduct various experiments to verify our methods.
