Flexible Triggering Kernels for Hawkes Process Modeling

Title

Flexible Triggering Kernels for Hawkes Process Modeling

Authors

Yamac Alican Isik, Connor Davis, Paidamoyo Chapfuwa, Ricardo Henao

Abstract

Recently proposed encoder-decoder structures for modeling Hawkes processes use transformer-inspired architectures, which encode the history of events via embeddings and self-attention mechanisms. These models deliver better prediction and goodness-of-fit than their RNN-based counterparts. However, they often have high computational and memory complexity and sometimes fail to adequately capture the triggering function of the underlying process. So motivated, we introduce an efficient and general encoding of the historical event sequence by replacing the complex (multilayered) attention structures with triggering kernels of the observed data. Noting the similarity between the triggering kernels of a point process and the attention scores, we use a triggering kernel to replace the weights used to build history representations. Our estimate of the triggering function is equipped with a sigmoid gating mechanism that captures local-in-time triggering effects that are otherwise challenging to capture with standard decaying-over-time kernels. Further, taking both event type representations and temporal embeddings as inputs, the model learns the underlying triggering type-time kernel parameters given pairs of event types. We present experiments on synthetic and real data sets widely used by competing models, and further include a COVID-19 dataset to illustrate a scenario where longitudinal covariates are available. Results show that the proposed model outperforms existing approaches while being more efficient in terms of computational complexity and yielding interpretable results via direct application of the newly introduced kernel.
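
The idea of using a gated triggering kernel in place of attention weights can be illustrated with a small sketch. The abstract does not give the kernel's functional form, so everything below is an assumption made for illustration: the kernel is taken to be an exponential decay multiplied by a sigmoid gate on elapsed time, its parameters are fixed scalars rather than values learned per pair of event types, and the names `gated_kernel` and `history_representation` are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_kernel(dt, alpha, beta, gate_scale, gate_shift):
    """Assumed kernel form: exponential decay times a sigmoid gate on elapsed time.

    The gate suppresses the kernel for very small dt, so the product peaks at a
    positive delay instead of decaying monotonically from dt = 0.
    """
    decay = alpha * np.exp(-beta * dt)
    gate = sigmoid(gate_scale * (dt - gate_shift))
    return decay * gate


def history_representation(times, embeddings, alpha=1.0, beta=1.0,
                           gate_scale=5.0, gate_shift=0.5):
    """Kernel-weighted sum of past event embeddings (stand-in for attention).

    times:      (n,) event times in ascending order
    embeddings: (n, d) one embedding per event
    Returns the (n, d) history representation and the (n, n) weight matrix.
    """
    times = np.asarray(times, dtype=float)
    embeddings = np.asarray(embeddings, dtype=float)
    dt = times[:, None] - times[None, :]      # dt[i, j] = t_i - t_j
    causal = dt > 0                           # only strictly earlier events count
    safe_dt = np.where(causal, dt, 0.0)       # avoid evaluating on negative gaps
    weights = np.where(
        causal,
        gated_kernel(safe_dt, alpha, beta, gate_scale, gate_shift),
        0.0,
    )
    return weights @ embeddings, weights


if __name__ == "__main__":
    event_times = [0.0, 0.4, 1.0, 2.5]
    event_embeddings = np.eye(4)              # toy one-hot embeddings
    h, w = history_representation(event_times, event_embeddings)
    print(np.round(w, 3))
```

In this sketch the weight matrix is strictly lower triangular, which plays the role of the causal mask in self-attention, and computing it needs only pairwise time differences rather than stacked attention layers. A type-dependent version would replace the scalar parameters with values looked up or predicted from the pair of event types, as the abstract describes.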
