Paper Title

PREGAN: Pose Randomization and Estimation for Weakly Paired Image Style Translation

Paper Authors

Zexi Chen, Jiaxin Guo, Xuecheng Xu, Yunkai Wang, Yue Wang, Rong Xiong

Paper Abstract

Utilizing a trained model under different conditions, without new data annotation, is attractive for robot applications. Toward this goal, one class of methods translates the image style of another environment into that of the environment the model was trained on. In this paper, we propose a weakly-paired setting for style translation, where the content in the two images is aligned up to errors in pose. Such images can be acquired by different sensors, under different conditions, over an overlapping region, e.g. with LiDAR or stereo cameras, on sunny days or in foggy nights. We consider this setting more practical because it offers: (i) easier labeling than paired data; (ii) better interpretability and detail retrieval than unpaired data. To translate across such images, we propose PREGAN, which trains a style translator by intentionally transforming the two images with a random pose and estimating that random pose with a differentiable, non-trainable pose estimator, exploiting the fact that the better the styles are aligned, the better the pose estimate. This adversarial training forces the network to learn style translation without entangling it with other variations. Finally, PREGAN is validated on both simulated and real-world collected data to show its effectiveness. Results on the downstream tasks of classification, road segmentation, object detection, and feature matching show its potential for real applications. Code: https://github.com/wrld/PRoGAN
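To make the pose-randomization idea concrete, below is a minimal, illustrative PyTorch sketch, not the authors' implementation. Assumptions: pose is reduced to a 2-D pixel translation (the paper also randomizes rotation), the non-trainable estimator is realized as FFT-based phase correlation with a soft-argmax, the GAN discriminator loss of the full method is omitted, and `translator`, `phase_correlation`, and `pose_consistency_loss` are hypothetical names.

```python
# Minimal sketch of PREGAN's pose-randomization training signal (simplified:
# 2-D translation only; `translator` is any image-to-image style network).
import random
import torch
import torch.nn.functional as F

def phase_correlation(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Differentiable but parameter-free shift estimator for (B, 1, H, W)
    images, via the Fourier shift theorem and a soft-argmax."""
    Fa, Fb = torch.fft.rfft2(a), torch.fft.rfft2(b)
    cross = Fa * torch.conj(Fb)
    cross = cross / (cross.abs() + 1e-8)              # normalized cross-power spectrum
    corr = torch.fft.irfft2(cross, s=a.shape[-2:])    # correlation surface, peak at shift
    B, _, H, W = corr.shape
    w = F.softmax(corr.reshape(B, -1) * 100.0, dim=-1)  # sharp soft-argmax weights
    ys = torch.arange(H, device=a.device).float().repeat_interleave(W)
    xs = torch.arange(W, device=a.device).float().repeat(H)
    dy, dx = (w * ys).sum(-1), (w * xs).sum(-1)       # expected peak location
    dy = torch.where(dy > H / 2, dy - H, dy)          # unwrap to signed shifts
    dx = torch.where(dx > W / 2, dx - W, dx)          # (assumes one sharp peak)
    return torch.stack([dy, dx], dim=-1)              # (B, 2) estimated (dy, dx)

def pose_consistency_loss(translator, src, tgt):
    """Shift `src` by a known random offset, style-translate it toward `tgt`,
    and ask the non-trainable estimator to recover the offset against `tgt`.
    The estimator has no weights to cheat with, so the loss can only drop
    if the translator changes style while preserving content and pose."""
    H, W = src.shape[-2:]
    dy, dx = random.randint(-H // 4, H // 4), random.randint(-W // 4, W // 4)
    src_moved = torch.roll(src, shifts=(dy, dx), dims=(-2, -1))
    fake_tgt = translator(src_moved)                  # style translation step
    est = phase_correlation(fake_tgt, tgt)            # differentiable pose estimate
    true = torch.tensor([dy, dx], dtype=torch.float32,
                        device=src.device).expand_as(est)
    return F.mse_loss(est, true)
```

Because phase correlation contains no learnable parameters, gradients can only reduce this loss by making `fake_tgt` match the target style while keeping content at its (randomized) pose, which is exactly the disentanglement the abstract describes.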
