SEEC: Semantic Vector Federation across Edge Computing Environments

Paper Title

SEEC: Semantic Vector Federation across Edge Computing Environments

Authors

Witherspoon, Shalisha; Steuer, Dean; Bent, Graham; Desai, Nirmit

Abstract

Semantic vector embedding techniques have proven useful in learning semantic representations of data across multiple domains. A key application enabled by such techniques is the ability to measure semantic similarity between given data samples and to find the data most similar to a given sample. State-of-the-art embedding approaches assume that all data is available at a single site. However, in many business settings, data is distributed across multiple edge locations and cannot be aggregated due to a variety of constraints. Hence, the applicability of state-of-the-art embedding approaches is limited to freely shared datasets, leaving out applications with sensitive or mission-critical data. This paper addresses this gap by proposing novel unsupervised algorithms, called SEEC, for learning and applying semantic vector embedding in a variety of distributed settings. Specifically, for scenarios where multiple edge locations can engage in joint learning, we adapt recently proposed federated learning techniques for semantic vector embedding. Where joint learning is not possible, we propose novel semantic vector translation algorithms to enable semantic queries across multiple edge locations, each with its own semantic vector space. Experimental results on natural language as well as graph datasets show that this approach may be a promising new direction.
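
The abstract names semantic similarity search as the key application of vector embeddings. The following is a minimal sketch of that idea only, not the SEEC algorithms themselves. It assumes cosine similarity as the similarity measure, which the abstract does not specify. The function names and the toy embeddings are hypothetical, chosen purely for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query, embeddings, top_k=1):
    """Return the top_k (key, score) pairs, highest cosine similarity first."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in embeddings.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; a real system would obtain these from a trained model.
EMBEDDINGS = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

if __name__ == "__main__":
    for key, score in most_similar([1.0, 0.0, 0.0], EMBEDDINGS, top_k=2):
        print(key, round(score, 4))
```

A search like this is only meaningful when the query and the stored vectors come from the same vector space. That is the constraint the abstract highlights for edge locations that each train their own embedding, and the reason it proposes either joint (federated) learning or vector translation.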
