HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching

Paper title

HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching

Authors

Vladimir Tankovich, Christian Häne, Yinda Zhang, Adarsh Kowdle, Sean Fanello, Sofien Bouaziz

Abstract

This paper presents HITNet, a novel neural network architecture for real-time stereo matching. Contrary to many recent neural network approaches that operate on a full cost volume and rely on 3D convolutions, our approach does not explicitly build a volume and instead relies on a fast multi-resolution initialization step and on differentiable 2D geometric propagation and warping mechanisms to infer disparity hypotheses. To achieve a high level of accuracy, our network not only reasons geometrically about disparities but also infers slanted plane hypotheses, which allow geometric warping and upsampling operations to be performed more accurately. Our architecture is inherently multi-resolution, allowing the propagation of information across different levels. Multiple experiments demonstrate the effectiveness of the proposed approach at a fraction of the computation required by state-of-the-art methods. At the time of writing, HITNet ranks 1st-3rd on all the metrics published on the ETH3D website for two-view stereo, ranks 1st on most of the metrics among all the end-to-end learning approaches on Middlebury-v3, and ranks 1st on the popular KITTI 2012 and 2015 benchmarks among the published methods faster than 100 ms.
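
To make the idea of a slanted plane hypothesis concrete, the following minimal sketch expands one tile hypothesis into per-pixel disparities using a first-order plane model. It is an illustration only, not the authors' implementation: the function name, the parameter names (d, dx, dy), and the 4x4 tile size are assumptions that the abstract does not state.

```python
def upsample_slanted_tile(d, dx, dy, tile_size=4):
    """Expand one tile hypothesis (d, dx, dy) into per-pixel disparities.

    Assumed model (illustrative): d is the disparity at the tile center,
    and dx, dy are the disparity gradients along x and y, so each pixel
    gets d + dx * (x - center) + dy * (y - center).
    """
    center = (tile_size - 1) / 2.0  # sub-pixel center of the tile
    return [
        [d + dx * (x - center) + dy * (y - center) for x in range(tile_size)]
        for y in range(tile_size)
    ]


# A fronto-parallel hypothesis (zero gradients) yields a constant patch,
# while a slanted hypothesis varies linearly across the tile.
flat = upsample_slanted_tile(10.0, 0.0, 0.0)
slanted = upsample_slanted_tile(10.0, 0.5, 0.0)
```

Under this assumed model, a plane with nonzero gradients lets a coarse tile describe a tilted surface, which a single constant disparity per tile could not do.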
