A Hybrid Deep Animation Codec for Low-bitrate Video Conferencing

Paper title

A Hybrid Deep Animation Codec for Low-bitrate Video Conferencing

Authors

Goluck Konuko, Stéphane Lathuilière, Giuseppe Valenzise

Abstract

Deep generative models, and particularly facial animation schemes, can be used in video conferencing applications to efficiently compress a video through a sparse set of keypoints, without the need to transmit dense motion vectors. While these schemes bring significant coding gains over conventional video codecs at low bitrates, their performance saturates quickly when the available bandwidth increases. In this paper, we propose a layered, hybrid coding scheme to overcome this limitation. Specifically, we extend a codec based on facial animation by adding an auxiliary stream consisting of a very low bitrate version of the video, obtained through a conventional video codec (e.g., HEVC). The animated and auxiliary videos are combined through a novel fusion module. Our results show consistent average BD-Rate gains in excess of -30% on a large dataset of video conferencing sequences, extending the operational range of bitrates of a facial animation codec alone.
