Paper Title

Spatio-Temporal Deformable Attention Network for Video Deblurring

Authors

Zhang, Huicong, Xie, Haozhe, Yao, Hongxun

Abstract

The key success factor of video deblurring methods is compensating for the blurry pixels of the mid-frame with the sharp pixels of the adjacent video frames. Therefore, mainstream methods align the adjacent frames based on estimated optical flows and fuse the aligned frames for restoration. However, these methods sometimes generate unsatisfactory results because they rarely consider the blur levels of pixels, which may introduce blurry pixels from the video frames. In fact, not all pixels in the video frames are sharp and beneficial for deblurring. To address this problem, we propose the spatio-temporal deformable attention network (STDANet) for video deblurring, which extracts the information of sharp pixels by considering the pixel-wise blur levels of the video frames. Specifically, STDANet is an encoder-decoder network combined with a motion estimator and a spatio-temporal deformable attention (STDA) module, where the motion estimator predicts coarse optical flows that are used as base offsets to find the corresponding sharp pixels in the STDA module. Experimental results indicate that the proposed STDANet performs favorably against state-of-the-art methods on the GoPro, DVD, and BSD datasets.
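The core idea of the abstract — use coarse optical flow as a base offset, refine it with learned deformable offsets, and blend the sampled neighbor pixels with attention weights — can be illustrated with a toy NumPy sketch. This is a minimal illustration under stated assumptions, not the paper's implementation: `stda_sample`, the offset ordering `(dy, dx)`, the nearest-neighbor sampling, and the assumption that the attention weights are already normalized over the temporal axis are all simplifications made here for clarity.

```python
import numpy as np

def stda_sample(mid, neighbors, flows, residual_offsets, weights):
    """Toy flow-guided deformable attention aggregation (illustrative only).

    For each pixel of the mid-frame, gather pixels from the adjacent frames
    at flow-guided locations (base offset = coarse optical flow from the
    motion estimator, refined by a learned residual offset), then blend the
    gathered pixels with attention weights. Nearest-neighbor sampling is a
    simplification; real deformable attention uses bilinear interpolation.

    mid:              (H, W)       mid-frame (used only for shape here)
    neighbors:        (T, H, W)    adjacent frames
    flows:            (T, H, W, 2) coarse flows, assumed (dy, dx) order
    residual_offsets: (T, H, W, 2) learned refinements of the base offsets
    weights:          (T, H, W)    attention weights, assumed normalized over T
    """
    T, H, W = neighbors.shape
    ys, xs = np.mgrid[0:H, 0:W]
    out = np.zeros_like(mid, dtype=float)
    for t in range(T):
        # Base offset (coarse flow) + learned deformable refinement.
        off = flows[t] + residual_offsets[t]
        sy = np.clip(np.rint(ys + off[..., 0]).astype(int), 0, H - 1)
        sx = np.clip(np.rint(xs + off[..., 1]).astype(int), 0, W - 1)
        # Weighted aggregation of the sampled (hopefully sharp) pixels.
        out += weights[t] * neighbors[t, sy, sx]
    return out
```

With zero offsets the sampling degenerates to a per-pixel weighted average of the neighbor frames; the benefit of the deformable offsets is that pixels can instead be drawn from wherever the sharp content landed after motion.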
