Paper Title
Improving the Robustness of Federated Learning for Severely Imbalanced Datasets
Paper Authors
Paper Abstract
With the ever-increasing data deluge and the success of deep neural networks, research on distributed deep learning has become prominent. Two common approaches to achieving such distributed learning are synchronous and asynchronous weight updates. In this manuscript, we explore a very simple synchronous weight update mechanism. It has been observed that, as the number of worker nodes increases, performance degrades drastically. This effect has been studied in the context of extremely imbalanced classification (e.g., outlier detection). In practical settings, the assumed i.i.d. conditions may not hold. Global class imbalance may also arise, as in outlier detection, where the local servers receive severely imbalanced data and may not receive any samples from the minority class. In that case, the DNNs on the local servers become completely biased towards the majority class they receive. This severely impacts learning at the parameter server (which, in practice, sees no data at all). It has been observed that, in a parallel setting, if one uses the existing federated weight update mechanisms at the parameter server, performance degrades drastically as the number of worker nodes grows. This is mainly because, with an increasing number of nodes, there is a high chance that a worker node receives only a small portion of the data: either too little to train the model without overfitting, or with a highly imbalanced class distribution. This chapter therefore proposes a workaround to this problem by introducing the concept of adaptive cost-sensitive momentum averaging. For the proposed system, there is little to no degradation in performance, whereas most of the other methods reach their worst performance well before that point.
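To make the synchronous weight update concrete, the following is a minimal sketch of the standard federated averaging step at the parameter server, where worker weight vectors are averaged in proportion to local dataset size. This is an illustration of the generic mechanism the abstract says degrades under imbalance, not the paper's proposed adaptive cost-sensitive momentum averaging; the function name and the size-weighted scheme are assumptions for illustration only.

```python
# Illustrative sketch of a synchronous federated weight update (FedAvg-style).
# Each worker is assumed to return its model weights as a flat list of floats;
# the parameter server aggregates them weighted by local dataset size.

def federated_average(worker_weights, worker_sizes):
    """Average worker weight vectors, weighted by each worker's data size."""
    total = sum(worker_sizes)
    n_params = len(worker_weights[0])
    averaged = [0.0] * n_params
    for weights, size in zip(worker_weights, worker_sizes):
        for i, w in enumerate(weights):
            averaged[i] += w * (size / total)
    return averaged

# Example: two workers, one holding three times as much data as the other.
w1 = [1.0, 2.0]   # weights from worker 1 (300 local samples)
w2 = [3.0, 6.0]   # weights from worker 2 (100 local samples)
print(federated_average([w1, w2], [300, 100]))  # -> [1.5, 3.0]
```

Note that this plain average has no notion of class distribution: a worker whose model is biased by a severely imbalanced local dataset contributes to the global model in proportion to its data size alone, which is the failure mode the abstract describes.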