Paper title
The Hateful Memes Challenge Next Move
Paper authors
Paper abstract
State-of-the-art image and text classification models, such as Convolutional Neural Networks and Transformers, have long been able to classify inputs in their respective single modalities with accuracy close to or exceeding that of humans. However, images with embedded text, such as hateful memes, are hard to classify using unimodal reasoning when difficult examples, such as benign confounders, are incorporated into the dataset. Building on the framework of a winning team from the Hateful Memes Challenge, we attempt to generate more labeled memes in addition to the Hateful Memes dataset from Facebook AI. To increase the number of labeled memes, we explore semi-supervised learning using pseudo-labels for newly introduced, unlabeled memes gathered from the Memotion Dataset 7K. We find that the semi-supervised learning task on unlabeled data requires human intervention and filtering, and that adding a limited amount of new data yields no improvement in classification performance.
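The pseudo-labeling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `select_pseudo_labels`, the 0.9 confidence threshold, and the example probabilities are all assumptions made for illustration. The idea shown is a common one, in which a trained classifier scores unlabeled memes and only confident predictions are kept as new training labels, while uncertain ones are set aside (for instance, for the kind of human filtering the abstract says was needed).

```python
from typing import List, Tuple


def select_pseudo_labels(
    probs: List[float], threshold: float = 0.9
) -> List[Tuple[int, int]]:
    """Return (index, label) pairs for confidently scored examples.

    probs[i] is a classifier's predicted probability that unlabeled
    meme i is hateful. The threshold is a hypothetical choice, not a
    value taken from the paper.
    """
    selected = []
    for i, p in enumerate(probs):
        if p >= threshold:
            selected.append((i, 1))  # confidently hateful
        elif p <= 1.0 - threshold:
            selected.append((i, 0))  # confidently not hateful
        # Anything in between is left unlabeled.
    return selected


# Made-up scores for four unlabeled memes.
hateful_probs = [0.97, 0.55, 0.03, 0.80]
pseudo_labeled = select_pseudo_labels(hateful_probs, threshold=0.9)
print(pseudo_labeled)
```

The selected pairs would then be merged into the labeled training set before retraining. A stricter threshold admits fewer but cleaner pseudo-labels, which is one plausible reason a limited amount of new data might not change classification performance.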