Title
Gravilon: Applications of a New Gradient Descent Method to Machine Learning
Authors
Abstract
Gradient descent algorithms have been used in countless applications since the inception of Newton's method. The explosion in the number of neural network applications has re-energized efforts in recent years to improve the standard gradient descent method in both efficiency and accuracy. These methods modify how the gradient is used to update the parameter values. These modifications often incorporate hyperparameters: additional variables whose values must be specified at the outset of the program. Below, we provide a novel gradient descent algorithm, called Gravilon, that uses the geometry of the hypersurface to modify the length of the step in the direction of the gradient. Using neural networks, we provide promising experimental results comparing the accuracy and efficiency of the Gravilon method against commonly used gradient descent algorithms on MNIST digit classification.
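For context, the baseline the abstract refers to can be sketched as standard gradient descent, where the learning rate is the hyperparameter that must be fixed at the outset. This is a minimal illustrative sketch of that baseline only; it does not implement Gravilon's geometry-based step-length rule, whose details are not given in the abstract. The function name and parameters here are illustrative, not from the paper.

```python
import numpy as np

def gradient_descent(grad, theta0, lr=0.1, steps=100):
    """Standard gradient descent update: theta <- theta - lr * grad(theta).

    `lr` (the learning rate) is the kind of hyperparameter the abstract
    describes: a value fixed before the program runs. Methods like Gravilon
    instead adapt the step length during optimization.
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        theta = theta - lr * grad(theta)
    return theta

# Toy example: minimize f(theta) = ||theta||^2, whose gradient is 2*theta;
# the iterates contract toward the minimum at the origin.
result = gradient_descent(lambda t: 2 * t, [3.0, -2.0], lr=0.1, steps=100)
```

Each iteration scales the error by a constant factor determined by `lr`, which is exactly why the choice of this hyperparameter matters: too small and convergence is slow, too large and the iterates diverge.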