Paper Title
Enhanced Performance of Pre-Trained Networks by Matched Augmentation Distributions
Paper Authors
Paper Abstract
There exists a distribution discrepancy between training and testing in the way images are fed to modern CNNs. Recent work tried to bridge this gap by fine-tuning or re-training the network at different resolutions. However, re-training a network is rarely cheap and not always viable. To this end, we propose a simple solution to address the train-test distributional shift and enhance the performance of pre-trained models -- which commonly ship as a package with deep learning platforms, \eg, PyTorch. Specifically, we demonstrate that running inference on the center crop of an image is not always the best choice, as important discriminatory information may be cropped off. Instead, we propose to combine results from multiple random crops of a test image. This not only matches the train-time augmentation but also provides full coverage of the input image. We explore combining the representations of random crops through averaging at different levels, \ie, the deep feature level, the logit level, and the softmax level. We demonstrate that, for various families of modern deep networks, such averaging results in better validation accuracy than using a single central crop per image. Softmax averaging yields the best performance for various pre-trained networks without requiring any re-training or fine-tuning whatsoever. On modern GPUs with batch processing, the paper's approach to inference with pre-trained networks is essentially free, as all images in a batch can be processed at once.
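The softmax-level averaging described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the `model` callable, crop size, and number of crops are placeholder assumptions standing in for an actual pre-trained CNN and its input resolution.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def random_crop(image, crop_size, rng):
    # image: (H, W, C) array; sample a crop_size x crop_size window
    # uniformly at random, mirroring train-time random-crop augmentation.
    h, w, _ = image.shape
    top = rng.integers(0, h - crop_size + 1)
    left = rng.integers(0, w - crop_size + 1)
    return image[top:top + crop_size, left:left + crop_size]

def multi_crop_softmax_average(image, model, crop_size=224,
                               n_crops=10, seed=0):
    # Run the model on several random crops of one test image and
    # average the softmax outputs (the softmax-level combination).
    rng = np.random.default_rng(seed)
    probs = [softmax(model(random_crop(image, crop_size, rng)))
             for _ in range(n_crops)]
    return np.mean(probs, axis=0)
```

Averaging at the logit or deep-feature level would instead collect `model(...)` outputs before (or without) the softmax and apply the mean there; with batched inference on a GPU, all crops can be stacked into one batch and processed in a single forward pass.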