Paper Title

Conditional Energy-Based Models for Implicit Policies: The Gap between Theory and Practice

Authors

Duy-Nguyen Ta, Eric Cousineau, Huihua Zhao, Siyuan Feng

Abstract

We present our findings on the gap between theory and practice in using conditional energy-based models (EBMs) as an implicit representation for behavior-cloned policies. We also clarify several subtle, and potentially confusing, details in previous work in an attempt to help future research in this area. We point out key differences between unconditional and conditional EBMs, and warn that blindly applying training methods for one to the other could lead to undesirable results that do not generalize well. Finally, we emphasize the importance of the Maximum Mutual Information principle as a necessary condition to achieve good generalization in conditional EBMs as implicit models for regression tasks.
