Paper title
Text Recognition -- Real World Data and Where to Find Them
Paper authors
Paper abstract
We present a method for exploiting weakly annotated images to improve text extraction pipelines. The approach uses an arbitrary end-to-end text recognition system to obtain text region proposals and their possibly erroneous transcriptions. The proposed method matches these imprecise transcriptions to the weak annotations and then applies an edit-distance-guided neighbourhood search. It produces nearly error-free, localised instances of scene text, which we treat as "pseudo ground truth" (PGT). We apply the method to two weakly annotated datasets. Training with the extracted PGT consistently improves the accuracy of a state-of-the-art recognition model, by 3.7% on average across different benchmark datasets (image domains) and by 24.5% on one of the weakly annotated datasets.
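
The matching step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the relative-distance threshold `max_rel_dist`, and the case-insensitive comparison are all assumptions made for illustration, and the neighbourhood search is not covered. The sketch only shows how a noisy transcription might be paired with the closest word from an image's weak annotation using Levenshtein edit distance.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def match_to_weak_annotations(transcriptions, annotations, max_rel_dist=0.25):
    """Pair each transcription with its closest weak annotation.

    Hypothetical helper: a pair is kept only when the edit distance,
    divided by the annotation length, is at most max_rel_dist
    (an assumed threshold, not a value from the paper).
    Returns a list of (transcription, annotation, distance) triples.
    """
    matches = []
    if not annotations:
        return matches
    for text in transcriptions:
        # Closest annotated word under case-insensitive edit distance.
        best = min(annotations,
                   key=lambda word: edit_distance(text.lower(), word.lower()))
        dist = edit_distance(text.lower(), best.lower())
        if best and dist / len(best) <= max_rel_dist:
            matches.append((text, best, dist))
    return matches


if __name__ == "__main__":
    # Assumed toy input: recogniser output and the image's weak annotation.
    print(match_to_weak_annotations(["c0ffee", "xyz"], ["coffee", "shop"]))
```

In a pipeline of this kind, the matched annotation word would presumably replace the recogniser's transcription as the label for the proposed region, while unmatched proposals such as `"xyz"` would be discarded.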