论文标题
(本地)私人组合半伴侣
(Locally) Differentially Private Combinatorial Semi-Bandits
论文作者
论文摘要
在本文中,我们研究了组合半伴侣(CSB),该组合是在差异隐私(DP)和更强的本地差异隐私(LDP)设置下的经典多臂土匪(MAB)的扩展。由于服务器从CSB中的用户收到更多信息,因此通常会导致对数据维度的额外依赖性,这是保留隐私学习的臭名昭著的副作用。但是,对于CSB,在两个常见的平滑度假设下\ cite {kveton2015tight,chen2016combinatorial},我们表明可以删除此副作用。详细说明,对于$ b _ {\ infty} $ - 在$ \ varepsilon $ -ldp或$ \ varepsilon $ -DP下,有限的平滑CSB,我们证明了最佳遗憾的限制为$θ(\ frac {mb^2 _ {\ infty} {\ infty} \ ln t} \ ln t} $^2} $ $ \tildeθ(\ frac {mb^2 _ {\ infty} \ ln t} {δε})$,其中$ t $是时间段,$δ$是奖励的差距,$ m $是$ m $是基本武器的数量,是通过提出小型基础武器的数量。对于$ \ \ \ varepsilon $ -DP,对于$ b_1 $ bunded的平滑CSB,我们还证明了最佳的遗憾绑定为$ \tildeθ(\ frac {mkb^2_1 \ ln t} {δε} {Δε} {δε} {Δim} $ complove and $ k $ a $ k $均为每个床上的最大数量。上述所有结果几乎匹配相应的非私人最佳速率,这意味着在公共设置上没有(本地)对私有CSB的额外价格。
In this paper, we study Combinatorial Semi-Bandits (CSB) that is an extension of classic Multi-Armed Bandits (MAB) under Differential Privacy (DP) and stronger Local Differential Privacy (LDP) setting. Since the server receives more information from users in CSB, it usually causes additional dependence on the dimension of data, which is a notorious side-effect for privacy preserving learning. However for CSB under two common smoothness assumptions \cite{kveton2015tight,chen2016combinatorial}, we show it is possible to remove this side-effect. In detail, for $B_{\infty}$-bounded smooth CSB under either $\varepsilon$-LDP or $\varepsilon$-DP, we prove the optimal regret bound is $Θ(\frac{mB^2_{\infty}\ln T } {Δε^2})$ or $\tildeΘ(\frac{mB^2_{\infty}\ln T} { Δε})$ respectively, where $T$ is time period, $Δ$ is the gap of rewards and $m$ is the number of base arms, by proposing novel algorithms and matching lower bounds. For $B_1$-bounded smooth CSB under $\varepsilon$-DP, we also prove the optimal regret bound is $\tildeΘ(\frac{mKB^2_1\ln T} {Δε})$ with both upper bound and lower bound, where $K$ is the maximum number of feedback in each round. All above results nearly match corresponding non-private optimal rates, which imply there is no additional price for (locally) differentially private CSB in above common settings.