Paper title
Discovering material information using hierarchical Reformer model on financial regulatory filings
Paper authors
Abstract
Most applications of machine learning in finance are forecasting tasks for investment decisions. Instead, we aim to promote a better understanding of financial markets with machine learning techniques. Leveraging the tremendous progress in deep learning models for natural language processing, we construct a hierarchical Reformer ([15]) model capable of processing SEDAR, a large document-level dataset of Canadian financial regulatory filings. Using this model, we show that it is possible to predict trade volume changes from regulatory filings. We adapt the pretraining task of HiBERT ([36]) to obtain good sentence-level representations from a large unlabelled document dataset. That the fine-tuned model successfully predicts trade volume changes indicates that it captures a view from financial markets and that processing regulatory filings is beneficial. Analyzing the attention patterns of our model reveals that it is able to detect some indications of material information without explicit training, which is highly relevant for investors and also for the market surveillance mandate of financial regulators.