Paper title

Expectation propagation on the diluted Bayesian classifier

Authors

Alfredo Braunstein, Thomas Gueudré, Andrea Pagnani, Mirko Pieropan

Abstract

Efficient feature selection from high-dimensional datasets is a very important challenge in many data-driven fields of science and engineering. We introduce a statistical mechanics inspired strategy that addresses the problem of sparse feature selection in the context of binary classification by leveraging a computational scheme known as expectation propagation (EP). The algorithm is used in order to train a continuous-weights perceptron learning a classification rule from a set of (possibly partly mislabeled) examples provided by a teacher perceptron with diluted continuous weights. We test the method in the Bayes optimal setting under a variety of conditions and compare it to other state-of-the-art algorithms based on message passing and on expectation maximization approximate inference schemes. Overall, our simulations show that EP is a robust and competitive algorithm in terms of variable selection properties, estimation accuracy and computational complexity, especially when the student perceptron is trained from correlated patterns that prevent other iterative methods from converging. Furthermore, our numerical tests demonstrate that the algorithm is capable of learning online the unknown values of prior parameters, such as the dilution level of the weights of the teacher perceptron and the fraction of mislabeled examples, quite accurately. This is achieved by means of a simple maximum likelihood strategy that consists in minimizing the free energy associated with the EP algorithm.
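The teacher-student setup described in the abstract can be illustrated with a minimal data-generation sketch. It is not the authors' code. It assumes a Gaussian distribution for the nonzero teacher weights and i.i.d. Gaussian patterns, neither of which the abstract specifies, and the names `make_teacher_student_data`, `rho` (fraction of nonzero weights) and `eta` (fraction of mislabeled examples) are hypothetical.

```python
import numpy as np

def make_teacher_student_data(n_features=200, n_examples=400,
                              rho=0.25, eta=0.05, seed=0):
    """Generate labeled examples from a diluted teacher perceptron.

    rho: probability that a teacher weight is nonzero (dilution level).
    eta: probability that a label is flipped (mislabeling fraction).
    """
    rng = np.random.default_rng(seed)

    # Diluted continuous weights: zero with probability 1 - rho.
    # The Gaussian law for nonzero weights is an assumption.
    mask = rng.random(n_features) < rho
    teacher = np.where(mask, rng.standard_normal(n_features), 0.0)

    # Uncorrelated Gaussian patterns (an assumption; the abstract
    # also mentions correlated patterns, not modeled here).
    patterns = rng.standard_normal((n_examples, n_features))

    # Noiseless teacher labels in {-1, +1}.
    clean = np.where(patterns @ teacher >= 0, 1, -1)

    # Flip each label independently with probability eta.
    flipped = rng.random(n_examples) < eta
    labels = np.where(flipped, -clean, clean)
    return patterns, labels, teacher, flipped
```

A student algorithm such as EP would receive only `patterns` and `labels`, and would try to recover which entries of `teacher` are nonzero, along with `rho` and `eta` if they are treated as unknown.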
