Paper title
Supervised similarity learning for corporate bonds using Random Forest proximities
Paper authors
Paper abstract
The financial literature contains ample research on the similarity and comparison of financial assets and securities such as stocks, bonds, and mutual funds. However, going beyond correlations or aggregate statistics has been arduous, since financial datasets are noisy, lack useful features, have missing data, and often lack ground truth or annotated labels. Although similarities extrapolated heuristically from these traditional models may work well at an aggregate level, such as in risk management for large portfolios, they often fail when used for portfolio construction and trading, which require a local and dynamic measure of similarity on top of a global measure. In this paper we propose a supervised similarity framework for corporate bonds that allows for inference based on both local and global measures. From a machine learning perspective, this paper emphasises that random forest (RF), which is usually viewed as a supervised learning algorithm, can also be used as a similarity learning (more specifically, a distance metric learning) algorithm. In addition, this framework proposes a novel metric to evaluate similarities and analyses other metrics, which further demonstrate that RF outperforms all other methods experimented with in this work.
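To illustrate how a trained random forest can double as a similarity measure, the following is a minimal sketch of the standard RF proximity: the fraction of trees in which two samples fall into the same leaf. The abstract does not specify the paper's exact construction, so this should be read as the textbook definition rather than the authors' method. The synthetic features, the target, the hyperparameters, and the helper name rf_proximity are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def rf_proximity(forest, X):
    """Return an (n, n) matrix whose (i, j) entry is the fraction of trees
    in which rows i and j of X land in the same leaf."""
    leaves = forest.apply(X)  # leaf index per sample and tree, shape (n_samples, n_trees)
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    return same_leaf.mean(axis=2)


# Hypothetical synthetic data standing in for bond features (for example
# duration, coupon, a rating score) and a supervised target such as a spread.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Hyperparameters are illustrative, not taken from the paper.
forest = RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=0)
forest.fit(X, y)

prox = rf_proximity(forest, X)  # similarity in [0, 1]
distance = 1.0 - prox           # one common way to turn proximity into a dissimilarity

# Five samples most similar to sample 0, excluding sample 0 itself.
order = np.argsort(-prox[0])
nearest = order[order != 0][:5]
print(nearest)
```

Because the trees are grown against a supervised target, two samples are considered close only when they agree on the features that matter for that target, which is what makes the resulting similarity supervised. The pairwise comparison above uses memory proportional to n_samples squared times n_trees, so a larger universe would need a chunked or sparse computation.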