Paper Title
Privacy-Preserving In-Bed Pose Monitoring: A Fusion and Reconstruction Study
Paper Authors
Paper Abstract
Recently, in-bed human pose estimation has attracted the interest of researchers due to its relevance to a wide range of healthcare applications. Compared to the general problem of human pose estimation, in-bed pose estimation has several inherent challenges, the most prominent being frequent and severe occlusions caused by bedding. In this paper, we explore the effective use of images from multiple non-visual and privacy-preserving modalities, such as depth, long-wave infrared (LWIR), and pressure maps, for the task of in-bed pose estimation in two settings. First, we explore the effective fusion of information from different imaging modalities for better pose estimation. Second, we propose a framework that can estimate in-bed pose when visible images are unavailable, and demonstrate the applicability of fusion methods to scenarios where only LWIR images are available. We analyze and demonstrate the effect of fusing features from multiple modalities. For this purpose, we consider four different techniques: 1) addition, 2) concatenation, 3) fusion via learned modal weights, and 4) an end-to-end fully trainable approach, each combined with a state-of-the-art pose estimation model. We also evaluate the effect of reconstructing a data-rich modality (i.e., the visible modality) from a privacy-preserving modality with data scarcity (i.e., LWIR) for in-bed human pose estimation. For reconstruction, we use a conditional generative adversarial network. We conduct ablation studies across the different design decisions of our framework, including selecting features at different levels of granularity, using different fusion techniques, and varying model parameters. Through extensive evaluations, we demonstrate that our method produces results on par with or better than the state-of-the-art.
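To make the four fusion techniques named in the abstract concrete, the following is a minimal PyTorch sketch of how per-modality feature maps could be combined by addition, concatenation, or learned modal weights. The module name `FeatureFusion`, the tensor shapes, and the modality count are illustrative assumptions, not the authors' implementation; the end-to-end fully trainable variant would correspond to training such a fusion module jointly with the pose estimation network.

```python
# Illustrative sketch only: addition, concatenation, and learned modal
# weights as feature-fusion strategies. All names and shapes here are
# assumptions for illustration, not the paper's actual code.
import torch
import torch.nn as nn


class FeatureFusion(nn.Module):
    """Fuse per-modality feature maps (e.g., LWIR, depth, pressure)."""

    def __init__(self, num_modalities: int, channels: int, mode: str = "add"):
        super().__init__()
        self.mode = mode
        if mode == "learned":
            # One scalar weight per modality, learned during training.
            self.weights = nn.Parameter(torch.ones(num_modalities))
        elif mode == "concat":
            # Project the concatenated features back to the original width.
            self.proj = nn.Conv2d(num_modalities * channels, channels, kernel_size=1)

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        if self.mode == "add":
            return torch.stack(feats).sum(dim=0)
        if self.mode == "concat":
            return self.proj(torch.cat(feats, dim=1))
        if self.mode == "learned":
            w = torch.softmax(self.weights, dim=0)
            return sum(wi * f for wi, f in zip(w, feats))
        raise ValueError(f"unknown fusion mode: {self.mode}")


# Example: fuse 64-channel feature maps from three modalities.
fusion = FeatureFusion(num_modalities=3, channels=64, mode="learned")
feats = [torch.randn(1, 64, 56, 56) for _ in range(3)]
fused = fusion(feats)  # shape (1, 64, 56, 56), fed to the pose head
```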
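The abstract states only that a conditional GAN is used to reconstruct the visible modality from LWIR. As a hedged illustration of what one training step of such a model could look like, the sketch below assumes a pix2pix-style objective (adversarial loss plus an L1 reconstruction term); the generator/discriminator interfaces `G`, `D` and the `lambda_l1` weight are hypothetical, not taken from the paper.

```python
# Hedged sketch of one conditional-GAN training step for LWIR -> visible
# reconstruction. The pix2pix-style L1 + adversarial objective and the
# G/D interfaces are assumptions for illustration only.
import torch
import torch.nn.functional as F


def cgan_step(G, D, opt_g, opt_d, lwir, visible, lambda_l1=100.0):
    """G maps LWIR to a visible image; D judges (LWIR, image) pairs."""
    fake = G(lwir)

    # Discriminator update: real pairs -> 1, generated pairs -> 0.
    opt_d.zero_grad()
    d_real = D(lwir, visible)
    d_fake = D(lwir, fake.detach())
    loss_d = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    loss_d.backward()
    opt_d.step()

    # Generator update: fool D while staying close to the target in L1.
    opt_g.zero_grad()
    d_fake = D(lwir, fake)
    loss_g = (F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
              + lambda_l1 * F.l1_loss(fake, visible))
    loss_g.backward()
    opt_g.step()
    return loss_g.item(), loss_d.item()
```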