Title
Global Matching with Overlapping Attention for Optical Flow Estimation
Authors
Abstract
Optical flow estimation is a fundamental task in computer vision. Recent direct-regression methods using deep neural networks achieve remarkable performance improvements. However, they do not explicitly capture long-term motion correspondences and thus cannot handle large motions effectively. In this paper, inspired by traditional matching-optimization methods, where matching is introduced to handle large displacements before energy-based optimization, we introduce a simple but effective global matching step before the direct regression and develop a learning-based matching-optimization framework, namely GMFlowNet. In GMFlowNet, global matching is efficiently calculated by applying argmax on 4D cost volumes. Additionally, to improve the matching quality, we propose patch-based overlapping attention to extract large-context features. Extensive experiments demonstrate that GMFlowNet outperforms RAFT, the most popular optimization-only method, by a large margin and achieves state-of-the-art performance on standard benchmarks. Thanks to the matching and overlapping attention, GMFlowNet obtains major improvements in its predictions for textureless regions and large motions. Our code is publicly available at https://github.com/xiaofeng94/GMFlowNet.
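The core idea of the global matching step, taking the argmax over a 4D cost volume of pairwise feature similarities to find each pixel's correspondence, can be sketched as follows. This is a minimal NumPy illustration under assumed inputs (per-pixel feature maps of shape [C, H, W]); the function name and shapes are ours, and the actual GMFlowNet implementation is in PyTorch with learned features and further refinement stages.

```python
import numpy as np

def global_matching_flow(feat1, feat2):
    """Illustrative sketch: global matching via argmax on a 4D cost volume.

    feat1, feat2: [C, H, W] feature maps for frames 1 and 2.
    Returns a [2, H, W] flow field (dx, dy) from the best-match positions.
    """
    C, H, W = feat1.shape
    f1 = feat1.reshape(C, H * W)
    f2 = feat2.reshape(C, H * W)
    # The 4D cost volume [H, W, H, W] flattened to [HW, HW]:
    # similarity between every pixel in frame 1 and every pixel in frame 2.
    cost = f1.T @ f2
    # Global matching: each frame-1 pixel's best match in frame 2.
    best = cost.argmax(axis=1)
    # Convert flat match indices back to (x, y) and subtract source coords.
    ys, xs = np.mgrid[0:H, 0:W]
    flow_x = (best % W).reshape(H, W) - xs
    flow_y = (best // W).reshape(H, W) - ys
    return np.stack([flow_x, flow_y]).astype(np.float64)
```

Because the argmax is taken over all positions rather than a local search window, this step can recover arbitrarily large displacements, which is exactly what the abstract argues local direct regression struggles with.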