Paper title
Variable Rate Video Compression using a Hybrid Recurrent Convolutional Learning Framework
Paper authors
Paper abstract
In recent years, neural network-based image compression techniques have outperformed traditional codecs and opened the way for learning-based video codecs. Exploiting the high temporal correlation in video, however, requires more sophisticated architectures. This paper presents PredEncoder, a hybrid video compression framework based on predictive auto-encoding: a prediction network models the temporal correlations between consecutive video frames and is combined with a progressive encoder network that exploits spatial redundancies. The paper also proposes a variable-rate block encoding scheme that yields high quality-to-bit-rate ratios. With joint training and fine-tuning of this hybrid architecture, PredEncoder improves significantly on the MPEG-4 codec, achieves bit-rate savings over the H.264 codec in the low to medium bit-rate range for HD videos, and gives comparable results over most bit-rates for non-HD videos. The work demonstrates how neural architectures can perform on par with highly optimized traditional methods in video compression.
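The abstract does not describe the networks themselves, so the following is only a rough sketch of the general idea of predictive coding with variable-rate block encoding, not PredEncoder's actual method. It rests on several assumptions that are not in the source: the previous frame stands in for the prediction network, a halving-step scalar quantizer stands in for the progressive encoder network, frames are flat lists of pixel values, and all names (`mse`, `progressive_encode_block`, `encode_frame`, `target_mse`) are hypothetical.

```python
def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def progressive_encode_block(residual, target_mse, max_passes=8, step=64.0):
    """Assumed stand-in for a progressive encoder.

    Each pass quantizes the error that is still left, then halves the
    quantization step. The loop stops as soon as the block meets the
    quality target, so the number of passes (a proxy for bits spent)
    varies from block to block.
    """
    recon = [0.0] * len(residual)
    passes = 0
    while passes < max_passes and mse(residual, recon) > target_mse:
        recon = [r + step * round((x - r) / step)
                 for x, r in zip(residual, recon)]
        step /= 2.0
        passes += 1
    return recon, passes


def encode_frame(frame, predicted, block_size=4, target_mse=1.0):
    """Encode the prediction residual of one frame block by block.

    `predicted` plays the role of the prediction network's output; here
    the caller is assumed to pass the previous frame.
    """
    recon_frame, passes_per_block = [], []
    for start in range(0, len(frame), block_size):
        cur = frame[start:start + block_size]
        pred = predicted[start:start + block_size]
        residual = [c - p for c, p in zip(cur, pred)]
        recon_res, passes = progressive_encode_block(residual, target_mse)
        recon_frame.extend(p + r for p, r in zip(pred, recon_res))
        passes_per_block.append(passes)
    return recon_frame, passes_per_block


# Usage: the first block is unchanged between frames, the second one moves.
previous = [100.0] * 8
current = [100.0] * 4 + [100.0, 130.0, 90.0, 160.0]
reconstructed, passes = encode_frame(current, previous)
print(passes)
```

In this toy setup a block that the prediction already matches needs no refinement passes, while a block with a large residual needs several, which is the intuition behind spending a variable number of bits per block.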