Paper Title
Video Frame Interpolation via Generalized Deformable Convolution
Paper Authors
Paper Abstract
Video frame interpolation aims at synthesizing intermediate frames from nearby source frames while maintaining spatial and temporal consistency. Existing deep-learning-based video frame interpolation methods can be roughly divided into two categories: flow-based methods and kernel-based methods. The performance of flow-based methods is often jeopardized by inaccurate flow map estimation due to oversimplified motion models, while that of kernel-based methods tends to be constrained by the rigidity of the kernel shape. To address these performance-limiting issues, a novel mechanism named generalized deformable convolution is proposed, which can effectively learn motion information in a data-driven manner and freely select sampling points in space-time. We further develop a new video frame interpolation method based on this mechanism. Our extensive experiments demonstrate that the new method performs favorably against the state of the art, especially when dealing with complex motions.
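To make the mechanism concrete, below is a minimal PyTorch sketch of the core idea described in the abstract: for each output pixel, a small network predicts per-sample spatial offsets and blending weights over both source frames, so the sampling locations can move freely in space and across time. This is our illustrative sketch, not the authors' implementation; the module name `GeneralizedDeformableSampler`, the number of samples `k`, and the offset/weight parameterization are all assumptions.

```python
# A minimal sketch (not the authors' code) of generalized deformable
# convolution for frame interpolation: for every output pixel, K sampling
# points per source frame are placed at learned spatial offsets, and learned
# weights blend the samples across the two frames, i.e. across time.
# All module/variable names here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeneralizedDeformableSampler(nn.Module):
    def __init__(self, k: int = 9):
        super().__init__()
        self.k = k
        # Predict, per pixel: K (x, y) offsets for each of the two source
        # frames (4K channels) plus 2K blending weights (6K channels total).
        self.pred = nn.Conv2d(6, 6 * k, kernel_size=3, padding=1)

    def forward(self, f0: torch.Tensor, f1: torch.Tensor) -> torch.Tensor:
        b, c, h, w = f0.shape
        params = self.pred(torch.cat([f0, f1], dim=1))
        offs, logits = params[:, : 4 * self.k], params[:, 4 * self.k :]
        # Weights sum to 1 over all 2K space-time samples.
        weights = torch.softmax(logits, dim=1)

        # Base sampling grid in normalized [-1, 1] coordinates, (x, y) order
        # as expected by grid_sample.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
        )
        base = torch.stack([xs, ys], dim=-1).to(f0)  # (h, w, 2)

        out = f0.new_zeros(b, c, h, w)
        offs = offs.view(b, 2, self.k, 2, h, w)
        for t, frame in enumerate((f0, f1)):
            for i in range(self.k):
                # Learned offset (in normalized units) moves each sampling
                # point freely in space ...
                delta = offs[:, t, i].permute(0, 2, 3, 1)  # (b, h, w, 2)
                grid = base.unsqueeze(0) + delta
                sample = F.grid_sample(
                    frame, grid, align_corners=True, padding_mode="border"
                )
                # ... and the learned weight decides how much each source
                # frame contributes, i.e. where to sample in time.
                w_i = weights[:, t * self.k + i : t * self.k + i + 1]
                out = out + w_i * sample
        return out
```

In a full interpolation network, a sampler like this would typically operate on features produced by a deeper motion-estimation backbone rather than on raw RGB frames, and the offsets and weights would be trained end to end through a reconstruction loss on the synthesized intermediate frame, which is what "learning motion information in a data-driven manner" amounts to in practice.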