A generalized linear joint trained framework for semi-supervised learning of sparse features

Paper title

A generalized linear joint trained framework for semi-supervised learning of sparse features

Authors

Laria, Juan C.; Clemmensen, Line H.; Ersbøll, Bjarne K.

Abstract

The elastic-net is among the most widely used regularization algorithms, commonly associated with supervised generalized linear model estimation via penalized maximum likelihood. Its desirable properties originate from a combination of $\ell_1$ and $\ell_2$ norms, which lets the method select variables while taking into account the correlations between them. In the last few years, semi-supervised approaches, which use both labeled and unlabeled data, have become an important component of statistical research. Despite this interest, few studies have investigated semi-supervised extensions of the elastic-net. This paper introduces a novel solution for semi-supervised learning of sparse features in the context of generalized linear model estimation: the generalized semi-supervised elastic-net (s2net), which extends the supervised elastic-net method with a general mathematical formulation that covers, but is not limited to, both regression and classification problems. We develop a flexible and fast implementation of s2net in R and illustrate its advantages using both real and synthetic data sets.
