Paper title
Differentiate Everything with a Reversible Embedded Domain-Specific Language
Authors
Abstract
Reverse-mode automatic differentiation (AD) suffers from high space overhead, because intermediate computational states must be traced back for back-propagation. The traditional method for tracing back states, called checkpointing, stores intermediate states in a global stack and restores them through either stack pops or recomputation. The overhead of stack manipulation and recomputation makes general-purpose (not tensor-based) AD engines unable to meet many industrial needs. Instead of checkpointing, we propose using reverse computing to trace back states by designing and implementing a reversible programming eDSL, in which a program can be executed bidirectionally without implicit stack operations. The absence of implicit stack operations makes the program compatible with existing compiler features, including reusing existing optimization passes and compiling the code to GPU kernels. We implement AD for sparse matrix operations and some machine learning applications to show that our framework achieves state-of-the-art performance.
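The idea of tracing back states by reverse computing, rather than by popping a stack, can be illustrated with a minimal sketch. The code below is hand-written Python and is not the paper's eDSL or its API; the function names (`forward`, `backward`, `value_and_grad`) and the toy program `out = sin(x * y)` are assumptions chosen for illustration. Each statement updates one register by a function of other registers, so it can be undone by the opposite update, and the backward pass recovers every intermediate value by uncomputing instead of reading a saved copy.

```python
import math


def forward(s):
    """Run the toy program out = sin(x * y) as reversible in-place updates."""
    s["t"] += s["x"] * s["y"]        # t += x * y
    s["out"] += math.sin(s["t"])     # out += sin(t)


def backward(s, g):
    """Undo each statement in reverse order and accumulate adjoints.

    No stack of saved intermediates is used: the value of t needed for the
    adjoint of sin is still in the state, and it is uncomputed afterwards.
    """
    # Inverse of: out += sin(t)
    s["out"] -= math.sin(s["t"])
    g["t"] += g["out"] * math.cos(s["t"])
    # Inverse of: t += x * y
    s["t"] -= s["x"] * s["y"]
    g["x"] += g["t"] * s["y"]
    g["y"] += g["t"] * s["x"]


def value_and_grad(x, y):
    """Return the output, the adjoints, and the state after uncomputing."""
    s = {"x": x, "y": y, "t": 0.0, "out": 0.0}
    forward(s)
    value = s["out"]
    g = {"x": 0.0, "y": 0.0, "t": 0.0, "out": 1.0}  # seed d(out)/d(out) = 1
    backward(s, g)
    return value, g, s


value, g, s = value_and_grad(1.5, 2.0)
print(value, g["x"], g["y"])
```

After `backward` runs, the registers `t` and `out` are back at their initial values, which is what allows the program to run in both directions without an implicit stack. In floating-point arithmetic the `+=`/`-=` pair is only approximately reversible in general because of rounding, so this sketch should be read as a conceptual illustration under those assumptions.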