Paper Title

Tree Echo State Autoencoders with Grammars

Authors

Benjamin Paassen, Irena Koprinska, Kalina Yacef

Abstract

Tree data occurs in many forms, such as computer programs, chemical molecules, or natural language. Unfortunately, the non-vectorial and discrete nature of trees makes it challenging to construct functions with tree-formed output, which complicates tasks such as optimization or time series prediction. Autoencoders address this challenge by mapping trees to a vectorial latent space, where tasks are easier to solve, and then mapping the solution back to a tree structure. However, existing autoencoding approaches for tree data fail to take the specific grammatical structure of tree domains into account and rely on deep learning, thus requiring large training datasets and long training times. In this paper, we propose tree echo state autoencoders (TES-AE), which are guided by a tree grammar and can be trained within seconds by virtue of reservoir computing. In our evaluation on three datasets, we demonstrate that our proposed approach is not only much faster than a state-of-the-art deep learning autoencoding approach (D-VAE) but also has lower autoencoding error when little data and time are available.
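
The encoding half of the idea, mapping a tree bottom-up into a fixed-size vector using fixed random weights that are never trained, can be illustrated with a minimal sketch. This is not the TES-AE described in the paper: it ignores the grammar, has no decoder, and every name in it (`make_reservoir`, `encode`, the `(symbol, children)` tuple format, the `scale` parameter) is a hypothetical choice made for illustration. In reservoir computing, typically only a readout on top of such states is trained, which is what keeps training fast.

```python
import math
import random


def make_reservoir(symbols, dim, scale=0.5, seed=0):
    """Draw fixed random weights; nothing here is ever trained.

    Assumption for this sketch: one random input vector per symbol and one
    recurrent matrix shared by all children.
    """
    rng = random.Random(seed)
    inputs = {s: [rng.uniform(-1, 1) for _ in range(dim)] for s in symbols}
    # Dividing by dim keeps each row's absolute sum at most `scale`, so the
    # contribution of a child state is damped rather than amplified.
    recurrent = [
        [rng.uniform(-1, 1) * scale / dim for _ in range(dim)]
        for _ in range(dim)
    ]
    return inputs, recurrent


def encode(tree, inputs, recurrent):
    """Encode a tree given as (symbol, [children]) into a list of floats."""
    symbol, children = tree
    dim = len(recurrent)
    pre = list(inputs[symbol])  # start from the symbol's input vector
    for child in children:
        child_state = encode(child, inputs, recurrent)  # bottom-up recursion
        for i in range(dim):
            pre[i] += sum(recurrent[i][j] * child_state[j] for j in range(dim))
    return [math.tanh(p) for p in pre]


# Usage: encode the expression tree for "x + 1".
inputs, recurrent = make_reservoir(["+", "x", "1"], dim=8)
tree = ("+", [("x", []), ("1", [])])
code = encode(tree, inputs, recurrent)
```

Because the weights are fixed by the seed, the same tree always maps to the same vector, and trees of any size map to vectors of the same length.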
