Paper Title
Algorithms and SQ Lower Bounds for PAC Learning One-Hidden-Layer ReLU Networks
Paper Authors
Paper Abstract
We study the problem of PAC learning one-hidden-layer ReLU networks with $k$ hidden units on $\mathbb{R}^d$ under Gaussian marginals in the presence of additive label noise. For the case of positive coefficients, we give the first polynomial-time algorithm for this learning problem for $k$ up to $\tilde{O}(\sqrt{\log d})$. Previously, no polynomial-time algorithm was known, even for $k=3$. This answers an open question posed by~\cite{Kliv17}. Importantly, our algorithm does not require any assumptions about the rank of the weight matrix, and its complexity is independent of the matrix's condition number. On the negative side, for the more general task of PAC learning one-hidden-layer ReLU networks with arbitrary real coefficients, we prove a Statistical Query lower bound of $d^{\Omega(k)}$. Thus, we provide a separation between the two classes in terms of efficient learnability. Our upper and lower bounds are general, extending to broader families of activation functions.
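To make the learning setup concrete, below is a minimal sketch of the data-generation model the abstract describes: labels come from a one-hidden-layer ReLU network $f(x) = \sum_{i=1}^{k} a_i \,\mathrm{ReLU}(\langle w_i, x \rangle)$ with positive coefficients $a_i$, inputs drawn from the standard Gaussian on $\mathbb{R}^d$, and additive label noise. This is an illustration of the problem instance, not the paper's algorithm; the specific parameter values, the Gaussian form of the noise, and the unit-norm weight rows are assumptions made here for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k = 50, 3     # ambient dimension and number of hidden units (assumed values)
n = 1000         # number of labeled samples
sigma = 0.1      # additive label-noise level (Gaussian noise assumed here)

# Target network: positive coefficients a_i > 0 (the efficiently learnable
# regime in the paper) and weight vectors w_1, ..., w_k as rows of W.
# Note: no rank or condition-number assumption on W is needed.
a = rng.uniform(0.5, 1.5, size=k)                # positive coefficients
W = rng.standard_normal((k, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)    # unit-norm rows (assumption)

# Gaussian marginals: each sample x ~ N(0, I_d).
X = rng.standard_normal((n, d))

# f(x) = sum_i a_i * ReLU(<w_i, x>), plus additive label noise.
clean = np.maximum(X @ W.T, 0.0) @ a
y = clean + sigma * rng.standard_normal(n)

# A PAC learner receives (X, y) and must output a hypothesis h achieving
# small squared error E[(h(x) - f(x))^2] over the Gaussian marginal.
```

For arbitrary real coefficients $a_i$ (allowing cancellations between units), the paper's SQ lower bound of $d^{\Omega(k)}$ indicates that no statistical-query algorithm can learn this class efficiently, which is the separation the abstract refers to.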