Paper title
Seq-UPS: Semi-Supervised Text Recognition
Seq-UPS: Sequential Uncertainty-aware Pseudo-label Selection for Semi-Supervised Text Recognition
Paper authors
Abstract
This paper addresses semi-supervised learning (SSL) for image-based text recognition. One of the most popular SSL approaches is pseudo-labeling (PL). PL approaches assign labels to unlabeled data and then re-train the model on a combination of labeled and pseudo-labeled data. However, poorly calibrated models generate erroneous high-confidence pseudo-labels, so PL methods are severely degraded by noise and prone to over-fitting to noisy labels, which renders threshold-based selection ineffective. Moreover, the combinatorial complexity of the hypothesis space and the error accumulation caused by multiple incorrect autoregressive steps make pseudo-labeling challenging for sequence models. To this end, we propose a pseudo-label generation and uncertainty-based data selection framework for semi-supervised text recognition. We first use beam-search inference to yield highly probable hypotheses, which are assigned as pseudo-labels to the unlabeled examples. We then adopt an ensemble of models, sampled by applying dropout, to obtain a robust estimate of the uncertainty associated with each prediction, considering both the character-level and word-level predictive distributions to select good-quality pseudo-labels. Extensive experiments on several benchmark handwriting and scene-text datasets show that our method outperforms the baseline approaches and the previous state-of-the-art semi-supervised text-recognition methods.
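The selection step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes that a beam-search hypothesis has already been chosen for each unlabeled image, and that K dropout-enabled forward passes have each returned the probability assigned to every character of that hypothesis. The function names, the thresholds, and the use of "1 minus mean probability" as the uncertainty measure are all illustrative assumptions; the abstract does not specify the exact measure.

```python
import math
from typing import List, Sequence, Tuple

# One dropout pass = the probability assigned to each character of the hypothesis.
DropoutPass = Sequence[float]


def sequence_uncertainty(passes: Sequence[DropoutPass]) -> Tuple[float, float]:
    """Return (word_uncertainty, worst_char_uncertainty) from K dropout passes.

    Assumption: uncertainty is 1 minus the mean probability over passes.
    This is a stand-in measure chosen for illustration only.
    """
    k = len(passes)
    length = len(passes[0])

    # Character level: average each position's probability over the K passes.
    mean_char = [sum(p[t] for p in passes) / k for t in range(length)]
    worst_char_uncertainty = max(1.0 - m for m in mean_char)

    # Word level: sequence probability per pass is the product of its
    # character probabilities; average that over the K passes.
    seq_probs = [math.prod(p) for p in passes]
    word_uncertainty = 1.0 - sum(seq_probs) / k

    return word_uncertainty, worst_char_uncertainty


def select_pseudo_labels(
    candidates: Sequence[Tuple[str, Sequence[DropoutPass]]],
    word_threshold: float = 0.2,   # hypothetical value
    char_threshold: float = 0.1,   # hypothetical value
) -> List[str]:
    """Keep hypotheses that are certain at both word and character level."""
    selected = []
    for text, passes in candidates:
        word_unc, char_unc = sequence_uncertainty(passes)
        if word_unc <= word_threshold and char_unc <= char_threshold:
            selected.append(text)
    return selected


# Toy data: three simulated dropout passes per hypothesis (made-up numbers).
candidates = [
    ("hello", [[0.99, 0.98, 0.99, 0.97, 0.99],
               [0.98, 0.99, 0.99, 0.98, 0.99],
               [0.99, 0.99, 0.98, 0.99, 0.98]]),
    ("wor1d", [[0.95, 0.90, 0.90, 0.40, 0.90],
               [0.90, 0.95, 0.85, 0.70, 0.95],
               [0.92, 0.90, 0.90, 0.30, 0.90]]),
]

print(select_pseudo_labels(candidates))
```

In this toy data the second hypothesis is rejected because the fourth character has low and unstable probability across the passes, even though the remaining characters look confident. Checking both levels is meant to catch such cases, where a single unreliable character would otherwise be hidden by a sequence-level score alone.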