Paper title
Stochastic gradient descent with gradient estimator for categorical features
Paper authors
Abstract
Categorical data are present in key areas such as health care and supply chain, and these data require specific treatment. Applying recent machine learning models to such data requires encoding. For interpretable models, one-hot encoding remains a very good solution, but it produces sparse data. Common gradient estimators are not suited to sparse data: the gradient is mostly treated as zero, whereas in many cases it simply does not exist. We therefore introduce a novel gradient estimator. We show what this estimator minimizes in theory and demonstrate its efficiency on different datasets with multiple model architectures. Under similar settings, the new estimator performs better than common estimators. A real-world retail dataset is also released after anonymization. Overall, the aim of this paper is to thoroughly consider categorical data and to adapt models and optimizers to these key features.
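The sparsity issue described in the abstract can be illustrated with a small sketch. The abstract does not define the proposed estimator, so the code below is not the paper's method. It assumes a linear model with squared loss on a single one-hot encoded feature, and it contrasts the usual mini-batch mean gradient with a hypothetical variant (`presence_normalized_gradient`, a name introduced here for illustration only) that averages each coordinate over the samples in which the category is actually present.

```python
import numpy as np

def one_hot(codes, n_categories):
    """Encode integer category codes as a dense one-hot matrix."""
    x = np.zeros((len(codes), n_categories))
    x[np.arange(len(codes)), codes] = 1.0
    return x

def batch_mean_gradient(x, y, w):
    """Usual mini-batch gradient of 0.5 * mean((x @ w - y) ** 2)."""
    residual = x @ w - y
    return x.T @ residual / len(y)

def presence_normalized_gradient(x, y, w):
    """Hypothetical variant (an assumption, not the paper's estimator):
    each coordinate is averaged over the samples where its category occurs.
    Categories absent from the batch keep a zero entry, meaning 'no update'."""
    residual = x @ w - y
    counts = x.sum(axis=0)      # occurrences of each category in the batch
    summed = x.T @ residual     # per-category sum of residuals
    return np.divide(summed, counts, out=np.zeros_like(summed), where=counts > 0)

# Toy batch: category 0 appears three times, category 1 once, category 2 never.
codes = np.array([0, 0, 0, 1])
y = np.array([1.0, 1.0, 1.0, 2.0])
w = np.zeros(3)
x = one_hot(codes, 3)

g_mean = batch_mean_gradient(x, y, w)
g_presence = presence_normalized_gradient(x, y, w)
print(g_mean)
print(g_presence)
```

In this toy batch, the mean gradient scales the update for the rare category 1 by its frequency in the batch and reports zero for the absent category 2, even though the batch carries no information about that weight. The presence-normalized variant removes the frequency scaling; how the paper's estimator handles this is specified in the paper itself, not here.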