Paper Title
Video Frame Interpolation via Generalized Deformable Convolution
Paper Authors
Paper Abstract
Video frame interpolation aims at synthesizing intermediate frames from nearby source frames while maintaining spatial and temporal consistency. Existing deep-learning-based video frame interpolation methods can be roughly divided into two categories: flow-based methods and kernel-based methods. The performance of flow-based methods is often jeopardized by inaccurate flow map estimation due to oversimplified motion models, while that of kernel-based methods tends to be constrained by the rigidity of the kernel shape. To address these performance-limiting issues, a novel mechanism named generalized deformable convolution is proposed, which can effectively learn motion information in a data-driven manner and freely select sampling points in space-time. We further develop a new video frame interpolation method based on this mechanism. Our extensive experiments demonstrate that the new method performs favorably against the state of the art, especially when dealing with complex motions.
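To make the mechanism concrete, below is a minimal PyTorch sketch of the core idea described in the abstract: for each output pixel, a small network predicts per-sample spatial offsets and blending weights over both source frames, so the sampling locations can move freely in space and across time. This is our illustrative sketch, not the authors' implementation; the module name `GeneralizedDeformableSampler`, the number of samples `k`, and the offset/weight parameterization are all assumptions.

```python
# A minimal sketch (not the authors' code) of generalized deformable
# convolution for frame interpolation: for every output pixel, K sampling
# points per source frame are placed at learned spatial offsets, and learned
# weights blend the samples across the two frames, i.e. across time.
# All module/variable names here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeneralizedDeformableSampler(nn.Module):
    def __init__(self, k: int = 9):
        super().__init__()
        self.k = k
        # Predict, per pixel: K (x, y) offsets for each of the two source
        # frames (4K channels) plus 2K blending weights (6K channels total).
        self.pred = nn.Conv2d(6, 6 * k, kernel_size=3, padding=1)

    def forward(self, f0: torch.Tensor, f1: torch.Tensor) -> torch.Tensor:
        b, c, h, w = f0.shape
        params = self.pred(torch.cat([f0, f1], dim=1))
        offs, logits = params[:, : 4 * self.k], params[:, 4 * self.k :]
        # Weights sum to 1 over all 2K space-time samples.
        weights = torch.softmax(logits, dim=1)

        # Base sampling grid in normalized [-1, 1] coordinates, (x, y) order
        # as expected by grid_sample.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
        )
        base = torch.stack([xs, ys], dim=-1).to(f0)  # (h, w, 2)

        out = f0.new_zeros(b, c, h, w)
        offs = offs.view(b, 2, self.k, 2, h, w)
        for t, frame in enumerate((f0, f1)):
            for i in range(self.k):
                # Learned offset (in normalized units) moves each sampling
                # point freely in space ...
                delta = offs[:, t, i].permute(0, 2, 3, 1)  # (b, h, w, 2)
                grid = base.unsqueeze(0) + delta
                sample = F.grid_sample(
                    frame, grid, align_corners=True, padding_mode="border"
                )
                # ... and the learned weight decides how much each source
                # frame contributes, i.e. where to sample in time.
                w_i = weights[:, t * self.k + i : t * self.k + i + 1]
                out = out + w_i * sample
        return out
```

In a full interpolation network, a sampler like this would typically operate on features produced by a deeper motion-estimation backbone rather than on raw RGB frames, and the offsets and weights would be trained end to end through a reconstruction loss on the synthesized intermediate frame, which is what "learning motion information in a data-driven manner" amounts to in practice.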