在私人随机凸的差异上优化和重尾数据

论文标题

在私人随机凸的差异上优化和重尾数据

On Differentially Private Stochastic Convex Optimization with Heavy-tailed Data

论文作者

Wang, Di, Xiao, Hanshen, Devadas, Srini, Xu, Jinhui

论文摘要

在本文中，我们考虑了在重尾数据上设计差异化私有（DP）算法（SCO）的问题。此类数据的不规则性违反了几乎所有现有的DP-SCO和DP-MER方法中使用的一些关键假设，从而导致未能提供DP保证。为了更好地了解这种类型的挑战，我们在本文中提供了对DP-SCO在各种情况下进行的全面研究。首先，我们考虑损失函数强烈凸出和光滑的情况。在这种情况下，我们提出了一种基于样本和聚集框架的方法，该方法的人口过剩风险为$ \ tilde {o}（\ frac {d^3} {d^3} {nε^4}）$（省略其他因素），其中$ n $是样本大小，$ d $是数据的尺寸。然后，我们表明，有了一些关于损失功能的其他假设，可以将\ textit {预期}过量的总体风险降低至$ \ tilde {o}（\ frac {d^2} {nε^2}）$。为了提高这些额外的条件，我们还提供了基于梯度平滑和基于修剪的方案，以实现$ \ tilde {o}（\ frac {d^2} {d^2} {nε^2}）$和$ \ tilde {o}（\ frac {d^\ frac {2} {3}} {（nε^2）^\ frac {1} {1} {3}}）$，分别为convex和一般convex损失功能，分别具有很高的概率}。实验表明，我们的算法可以有效地应对数据不规则性引起的挑战。

In this paper, we consider the problem of designing Differentially Private (DP) algorithms for Stochastic Convex Optimization (SCO) on heavy-tailed data. The irregularity of such data violates some key assumptions used in almost all existing DP-SCO and DP-ERM methods, resulting in failure to provide the DP guarantees. To better understand this type of challenges, we provide in this paper a comprehensive study of DP-SCO under various settings. First, we consider the case where the loss function is strongly convex and smooth. For this case, we propose a method based on the sample-and-aggregate framework, which has an excess population risk of $\tilde{O}(\frac{d^3}{nε^4})$ (after omitting other factors), where $n$ is the sample size and $d$ is the dimensionality of the data. Then, we show that with some additional assumptions on the loss functions, it is possible to reduce the \textit{expected} excess population risk to $\tilde{O}(\frac{ d^2}{ nε^2 })$. To lift these additional conditions, we also provide a gradient smoothing and trimming based scheme to achieve excess population risks of $\tilde{O}(\frac{ d^2}{nε^2})$ and $\tilde{O}(\frac{d^\frac{2}{3}}{(nε^2)^\frac{1}{3}})$ for strongly convex and general convex loss functions, respectively, \textit{with high probability}. Experiments suggest that our algorithms can effectively deal with the challenges caused by data irregularity.

下载PDF全文

下载文献需遵守相关版权规定

论文标题