Learning Image-adaptive 3D Lookup Tables for High Performance Photo Enhancement in Real-time

Paper title

Learning Image-adaptive 3D Lookup Tables for High Performance Photo Enhancement in Real-time

Authors

Zeng, Hui; Cai, Jianrui; Li, Lida; Cao, Zisheng; Zhang, Lei

Abstract

Recent years have witnessed the increasing popularity of learning-based methods for enhancing the color and tone of photos. However, many existing photo enhancement methods either deliver unsatisfactory results or consume too many computational and memory resources, hindering their application to high-resolution images (usually with more than 12 megapixels) in practice. In this paper, we learn image-adaptive 3-dimensional lookup tables (3D LUTs) to achieve fast and robust photo enhancement. 3D LUTs are widely used for manipulating the color and tone of photos, but they are usually manually tuned and fixed in the camera imaging pipeline or in photo editing tools. To the best of our knowledge, we are the first to propose learning 3D LUTs from annotated data using pairwise or unpaired learning. More importantly, our learned 3D LUT is image-adaptive for flexible photo enhancement. We learn multiple basis 3D LUTs and a small convolutional neural network (CNN) simultaneously in an end-to-end manner. The small CNN works on a down-sampled version of the input image to predict content-dependent weights, which fuse the multiple basis 3D LUTs into an image-adaptive one that efficiently transforms the color and tone of the source image. Our model contains fewer than 600K parameters and takes less than 2 ms to process an image of 4K resolution using one Titan RTX GPU. While being highly efficient, our model also outperforms state-of-the-art photo enhancement methods by a large margin in terms of PSNR, SSIM and a color difference metric on two publicly available benchmark datasets.
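
The fusion step described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the grid size `N`, the two basis LUTs (an identity table and an inverting table), and the hand-picked `weights` are assumptions chosen for illustration, with the weights standing in for what the small CNN would predict from the down-sampled image. The function names are hypothetical as well. The sketch builds the basis LUTs, fuses them into one image-adaptive LUT by a weighted sum, and applies that LUT to a single pixel with trilinear interpolation.

```python
from itertools import product


def identity_lut(n):
    """Build an n x n x n LUT that maps each RGB grid point to itself (range [0, 1])."""
    step = 1.0 / (n - 1)
    return [[[(r * step, g * step, b * step) for b in range(n)]
             for g in range(n)] for r in range(n)]


def fuse_luts(basis_luts, weights):
    """Fuse basis LUTs into one LUT as a weighted sum of their entries."""
    n = len(basis_luts[0])
    return [[[tuple(sum(w * lut[r][g][b][c] for w, lut in zip(weights, basis_luts))
                    for c in range(3))
              for b in range(n)] for g in range(n)] for r in range(n)]


def apply_lut(lut, rgb):
    """Look up one RGB pixel (channels in [0, 1]) using trilinear interpolation."""
    n = len(lut)
    idx, frac = [], []
    for v in rgb:
        x = min(max(v, 0.0), 1.0) * (n - 1)  # continuous grid coordinate
        i = min(int(x), n - 2)               # lower corner of the enclosing cell
        idx.append(i)
        frac.append(x - i)
    out = [0.0, 0.0, 0.0]
    # Blend the 8 corners of the enclosing grid cell.
    for dr, dg, db in product((0, 1), repeat=3):
        w = ((frac[0] if dr else 1.0 - frac[0]) *
             (frac[1] if dg else 1.0 - frac[1]) *
             (frac[2] if db else 1.0 - frac[2]))
        corner = lut[idx[0] + dr][idx[1] + dg][idx[2] + db]
        for c in range(3):
            out[c] += w * corner[c]
    return tuple(out)


N = 9  # hypothetical grid size, kept small for readability
identity = identity_lut(N)
# Second illustrative basis LUT: inverts every channel (v -> 1 - v).
inverted = [[[tuple(1.0 - v for v in cell) for cell in row]
             for row in plane] for plane in identity]

weights = (0.75, 0.25)  # assumption: stand-in for the CNN's predicted weights
adaptive = fuse_luts([identity, inverted], weights)

pixel = (0.2, 0.5, 0.9)
enhanced = apply_lut(adaptive, pixel)
```

Because the fused table has a fixed size, the per-pixel cost is one interpolated lookup regardless of how the weights were produced, which is consistent with the efficiency argument the abstract makes for high-resolution images.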
