Paper Title

Predicting The Stock Trend Using News Sentiment Analysis and Technical Indicators in Spark

Authors

Taylan Kabbani, Fatih Enes Usta

Abstract

Predicting the stock market trend has always been challenging because its movement is affected by many factors. Here, we approach future trend prediction as a machine learning classification problem by creating a tomorrow_trend feature as the label to be predicted. Several features help the model predict whether the label for a given day is an uptrend or a downtrend; these features are technical indicators generated from the stock's price history. In addition, because financial news plays a vital role in changing investors' behavior, an overall sentiment score for each day is computed from all news released on that day and added to the model as another feature. Three machine learning models are tested in Spark (a big-data computing platform): Logistic Regression, Random Forest, and Gradient Boosting Machine. Random Forest was the best-performing model, with a test accuracy of 63.58%.
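The feature construction described in the abstract can be illustrated with a small plain-Python sketch. It is not the authors' Spark pipeline. The helper names (`simple_moving_average`, `daily_sentiment`, `build_rows`) are hypothetical. So are the choice of a simple moving average as the technical indicator, the rule that `tomorrow_trend` is 1 when the next close is higher, and the neutral score of 0.0 for days without news. The abstract does not specify any of these details.

```python
from collections import defaultdict
from statistics import mean


def simple_moving_average(closes, window):
    """Trailing SMA; None until `window` closes are available."""
    out = []
    for i in range(len(closes)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(mean(closes[i + 1 - window : i + 1]))
    return out


def daily_sentiment(news_scores):
    """Average per-article scores into one overall score per day."""
    by_day = defaultdict(list)
    for day, score in news_scores:
        by_day[day].append(score)
    return {day: mean(scores) for day, scores in by_day.items()}


def build_rows(days, closes, news_scores, window=3):
    """Join indicator, sentiment, and next-day label into one row per day."""
    sma = simple_moving_average(closes, window)
    sentiment = daily_sentiment(news_scores)
    rows = []
    # The last day has no "tomorrow", so it cannot be labeled and is skipped.
    for i in range(len(days) - 1):
        if sma[i] is None:
            continue  # not enough price history for the indicator yet
        rows.append({
            "day": days[i],
            "close": closes[i],
            "sma": sma[i],
            # Assumption: days without news get a neutral score of 0.0.
            "sentiment": sentiment.get(days[i], 0.0),
            # Assumption: 1 = uptrend (next close higher), 0 = downtrend.
            "tomorrow_trend": 1 if closes[i + 1] > closes[i] else 0,
        })
    return rows


days = ["d1", "d2", "d3", "d4", "d5"]
closes = [10.0, 11.0, 12.0, 11.0, 13.0]
news = [("d3", 0.5), ("d3", -0.25), ("d4", 0.8)]
for row in build_rows(days, closes, news):
    print(row)
```

Each resulting row pairs a day's features with the label for the following day, which is the form a classifier such as Logistic Regression or Random Forest expects as input.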
