Fused Text Recogniser and Deep Embeddings Improve Word Recognition and Retrieval

Paper Title

Fused Text Recogniser and Deep Embeddings Improve Word Recognition and Retrieval

Authors

Siddhant Bansal, Praveen Krishnan, C. V. Jawahar

Abstract

Recognition and retrieval of textual content from large document collections has been a powerful use case for the document image analysis community. Often the word is the basic unit for recognition as well as retrieval. Systems that rely only on text recogniser (OCR) output are not robust enough in many situations, especially when word recognition rates are poor, as with historic documents or digital libraries. An alternative is word spotting, which retrieves or matches words based on a holistic representation of the word. In this paper, we fuse the noisy output of a text recogniser with a deep embedding representation derived from the entire word. We use average and max fusion to improve the ranked results in the case of retrieval. We validate our methods on a collection of Hindi documents. We improve word recognition rate by 1.4 and retrieval by 11.13 in mAP.
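
The average and max fusion mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the scores are computed or normalised, so the sketch assumes that each candidate word already has two similarity scores on a comparable scale (higher is better), one from the text recogniser and one from the deep embedding. The names fuse_scores and rank_candidates are hypothetical.

```python
# Minimal sketch of score-level fusion for retrieval (illustrative only).
# Assumption: both score lists are similarities on a comparable scale,
# where a higher value means a better match to the query.

def fuse_scores(ocr_scores, embedding_scores, mode="average"):
    """Combine two per-candidate score lists into one fused list."""
    if len(ocr_scores) != len(embedding_scores):
        raise ValueError("score lists must have the same length")
    if mode == "average":
        # Average fusion: mean of the two scores for each candidate.
        return [(a + b) / 2.0 for a, b in zip(ocr_scores, embedding_scores)]
    if mode == "max":
        # Max fusion: keep the more confident of the two scores.
        return [max(a, b) for a, b in zip(ocr_scores, embedding_scores)]
    raise ValueError("mode must be 'average' or 'max'")


def rank_candidates(scores):
    """Return candidate indices sorted from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Hypothetical scores for three candidate words.
    ocr = [0.25, 0.875, 0.5]
    emb = [0.5, 0.25, 0.75]
    for mode in ("average", "max"):
        fused = fuse_scores(ocr, emb, mode)
        print(mode, fused, rank_candidates(fused))
```

With these made-up scores the two strategies produce different rankings: average fusion favours the candidate on which both sources agree reasonably well, while max fusion favours the candidate that either source scores highest.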
