Paper Title

Deep Reinforcement Learning (DRL): Another Perspective for Unsupervised Wireless Localization

Authors

You Li, Xin Hu, Yuan Zhuang, Zhouzheng Gao, Peng Zhang, Naser El-Sheimy

Abstract

Location is key to spatialize internet-of-things (IoT) data. However, it is challenging to use low-cost IoT devices for robust unsupervised localization (i.e., localization without training data that have known location labels). Thus, this paper proposes a deep reinforcement learning (DRL) based unsupervised wireless-localization method. The main contributions are as follows. (1) This paper proposes an approach to model a continuous wireless-localization process as a Markov decision process (MDP) and process it within a DRL framework. (2) To alleviate the challenge of obtaining rewards when using unlabeled data (e.g., daily-life crowdsourced data), this paper presents a reward-setting mechanism, which extracts robust landmark data from unlabeled wireless received signal strengths (RSS). (3) To ease requirements for model re-training when using DRL for localization, this paper uses RSS measurements together with agent location to construct DRL inputs. The proposed method was tested by using field testing data from multiple Bluetooth 5 smart ear tags in a pasture. Meanwhile, the experimental verification process reflected the advantages and challenges of using DRL in wireless localization.
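The abstract names three design choices but, as an abstract, gives no implementation detail. Purely as an illustration, the sketch below shows one way those choices could fit together in a gym-style environment: the state is the RSS vector concatenated with the agent's current location estimate (contribution 3), actions nudge the estimate on a 2-D grid (one MDP formulation consistent with contribution 1), and reward fires only when the live RSS matches a previously mined landmark fingerprint (contribution 2). Everything here — class and variable names, the grid/action design, and the matching threshold — is an assumption for illustration, not the authors' code.

```python
# Hypothetical sketch, NOT the paper's released implementation: a minimal
# gym-style MDP for unsupervised wireless localization from RSS landmarks.
import numpy as np

class WirelessLocalizationEnv:
    """State = [current RSS vector, estimated x, estimated y]."""

    # Four discrete actions: move the location estimate one grid step.
    ACTIONS = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])

    def __init__(self, rss_stream, landmarks, grid_size=(50, 50)):
        self.rss_stream = rss_stream    # sequence of RSS vectors, one per epoch
        self.landmarks = landmarks      # {(x, y): reference RSS vector}, mined offline
        self.grid_size = np.array(grid_size)
        self.t = 0
        self.pos = self.grid_size // 2  # start the estimate at the grid center

    def _state(self):
        # Contribution (3): RSS measurements concatenated with agent location.
        return np.concatenate([self.rss_stream[self.t], self.pos])

    def reset(self):
        self.t = 0
        self.pos = self.grid_size // 2
        return self._state()

    def step(self, action):
        self.pos = np.clip(self.pos + self.ACTIONS[action], 0, self.grid_size - 1)
        rss = self.rss_stream[self.t]
        # Contribution (2): reward only when the current RSS matches a landmark
        # fingerprint; closer estimates to that landmark earn larger rewards.
        reward = 0.0
        for lm_pos, lm_rss in self.landmarks.items():
            if np.linalg.norm(rss - lm_rss) < 5.0:  # match threshold (assumed)
                dist = np.linalg.norm(self.pos - np.array(lm_pos))
                reward = max(reward, 1.0 / (1.0 + dist))
        self.t += 1
        done = self.t >= len(self.rss_stream)
        next_state = None if done else self._state()
        return next_state, reward, done
```

Any standard DRL agent (e.g., a DQN) could be trained against `reset()`/`step()`. The design point worth noting is that, because the RSS vector is part of the state rather than implicit in the model, the same trained policy can in principle keep working as signal conditions drift, which is the retraining-reduction argument behind contribution (3).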
