Paper Title
Training CNNs in Presence of JPEG Compression: Multimedia Forensics vs Computer Vision
Authors
Abstract
Convolutional Neural Networks (CNNs) have proved very accurate in multiple computer vision image classification tasks that used to require human visual inspection (e.g., object recognition, face detection, etc.). Motivated by these astonishing results, researchers have also started using CNNs to tackle image forensic problems (e.g., camera model identification, tampering detection, etc.). However, in computer vision, image classification methods typically rely on visual cues easily detectable by the human eye. Conversely, forensic solutions rely on almost invisible traces that are often very subtle and lie in the fine details of the image under analysis. For this reason, training a CNN to solve a forensic task requires special care, as common processing operations (e.g., resampling, compression, etc.) can strongly hinder forensic traces. In this work, we focus on the effect that JPEG has on CNN training, considering different computer vision and forensic image classification problems. Specifically, we consider the issues that arise from JPEG compression and from misalignment of the JPEG grid. We show that these effects must be taken into account when generating a training dataset, so that a forensic detector is trained properly and does not lose generalization capability, whereas they can be almost ignored for computer vision tasks.
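To make the notion of JPEG grid misalignment concrete, the following is a minimal sketch of how patch positions for a training set could be chosen either on or off the 8x8 JPEG block grid. It is not the authors' pipeline: the function name `patch_origins`, its parameters, and the assumption that the grid origin sits at the top-left pixel of the image are all illustrative choices.

```python
import random

JPEG_BLOCK = 8  # JPEG codes an image in 8x8 pixel blocks


def patch_origins(height, width, patch, count, aligned, seed=0):
    """Return `count` (row, col) top-left corners for square patches.

    Hypothetical helper for illustration only.
    aligned=True  -> both coordinates are multiples of 8 (on the JPEG grid).
    aligned=False -> both coordinates are shifted by 1..7 pixels (off the grid).
    Assumes the JPEG grid starts at pixel (0, 0) of the image.
    """
    if height - patch < JPEG_BLOCK or width - patch < JPEG_BLOCK:
        raise ValueError("image too small for the requested patch size")
    rng = random.Random(seed)
    origins = []
    for _ in range(count):
        corner = []
        for extent in (height, width):
            # Number of whole 8-pixel steps that still leave room for the patch.
            blocks = (extent - patch) // JPEG_BLOCK
            if aligned:
                corner.append(rng.randint(0, blocks) * JPEG_BLOCK)
            else:
                base = rng.randint(0, blocks - 1) * JPEG_BLOCK
                corner.append(base + rng.randint(1, JPEG_BLOCK - 1))
        origins.append(tuple(corner))
    return origins


if __name__ == "__main__":
    print(patch_origins(512, 768, 64, 3, aligned=True))
    print(patch_origins(512, 768, 64, 3, aligned=False))
```

Under these assumptions, a training set built only from aligned corners never shows the network a shifted block grid, which is the kind of mismatch between training and test data that the abstract warns about for forensic detectors.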