Paper title
Forecasting Cryptocurrency Returns from Sentiment Signals: An Analysis of BERT Classifiers and Weak Supervision
Paper authors
Paper abstract
Anticipating price developments in financial markets is a topic of continued interest in forecasting. Fueled by advances in deep learning and natural language processing (NLP), together with the availability of vast amounts of textual data in the form of news articles, social media postings, etc., an increasing number of studies incorporate text-based predictors in forecasting models. We contribute to this literature by introducing weak learning (weak supervision), a recently proposed NLP approach that addresses the problem that text data is unlabeled. Without a dependent variable, it is not possible to finetune pretrained NLP models on a custom corpus. We confirm that finetuning with weak labels enhances the predictive value of text-based features and raises forecast accuracy when predicting cryptocurrency returns. More fundamentally, the modeling paradigm we present, weak labeling of domain-specific text followed by finetuning of pretrained NLP models, is universally applicable in (financial) forecasting and unlocks new ways to leverage text data.
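To make the idea of weak labeling concrete, the following is a minimal sketch of how unlabeled posts could be assigned noisy sentiment labels by combining simple heuristics. It is not the labeling procedure used in the paper, which the abstract does not specify. The keyword lists, the emoji rule, and the names `lf_keywords`, `lf_emoji`, and `weak_label` are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical label encoding and keyword lists (assumptions, not from the paper).
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1
BULLISH = {"moon", "rally", "surge", "bullish", "buy"}
BEARISH = {"crash", "dump", "plunge", "bearish", "sell"}


def lf_keywords(text):
    """Labeling function: vote by counting bullish vs. bearish keywords."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    pos = len(tokens & BULLISH)
    neg = len(tokens & BEARISH)
    if pos > neg:
        return POSITIVE
    if neg > pos:
        return NEGATIVE
    return ABSTAIN


def lf_emoji(text):
    """Labeling function: vote based on two illustrative emoji."""
    if "🚀" in text:
        return POSITIVE
    if "📉" in text:
        return NEGATIVE
    return ABSTAIN


def weak_label(text, labeling_functions=(lf_keywords, lf_emoji)):
    """Combine labeling functions by majority vote; abstain on ties or no votes."""
    votes = [lf(text) for lf in labeling_functions]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting heuristics cancel out
    return ranked[0][0]


posts = ["Bitcoin rally continues 🚀", "ETH crash incoming", "Just a regular day"]
labels = [weak_label(p) for p in posts]
```

Under this sketch, posts that receive a non-abstaining label would form the training set for finetuning a pretrained classifier such as BERT, while abstained posts would be left out.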