Title
Bayesian Imputation with Optimal Look-Ahead-Bias and Variance Tradeoff
Authors
Abstract
Missing time-series data is a prevalent problem in many prescriptive analytics models in operations management, healthcare and finance. Imputation methods for time-series data are usually applied to the full panel data with the purpose of training a prescriptive model for a downstream out-of-sample task. For example, the imputation of missing asset returns may be applied before estimating an optimal portfolio allocation. However, this practice can result in a look-ahead-bias in the future performance of the downstream task, and there is an inherent trade-off between the look-ahead-bias of using the entire data set for imputation and the larger variance of using only the training portion of the data set for imputation. By connecting layers of information revealed in time, we propose a Bayesian consensus posterior that fuses an arbitrary number of posteriors to optimize the variance and look-ahead-bias trade-off in the imputation. We derive tractable two-step optimization procedures for finding the optimal consensus posterior, with Kullback-Leibler divergence and Wasserstein distance as the dissimilarity measure between posterior distributions. We demonstrate in simulations and in an empirical study the benefit of our imputation mechanism for portfolio allocation with missing returns.
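To make the idea of fusing posteriors concrete, the following is a minimal sketch and not the paper's two-step procedure, which the abstract does not spell out. It assumes one-dimensional Gaussian posteriors for a single missing return and uses two standard closed-form results: the minimizer of the weighted sum of KL(q || p_i) is the log-linear (precision-weighted) pool, and the 2-Wasserstein barycenter of 1D Gaussians averages the means and the standard deviations. The function names, the example posteriors, and the weights are all hypothetical.

```python
import numpy as np


def kl_consensus(means, variances, weights):
    """Fuse 1D Gaussian posteriors by minimizing sum_i w_i * KL(q || p_i).

    The minimizer is proportional to prod_i p_i^{w_i} (log-linear pooling),
    which for Gaussians is again Gaussian with precision-weighted moments.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize weights to sum to one
    m = np.asarray(means, dtype=float)
    v = np.asarray(variances, dtype=float)
    precision = (w / v).sum()
    var = 1.0 / precision
    mean = var * (w * m / v).sum()
    return mean, var


def wasserstein_consensus(means, variances, weights):
    """2-Wasserstein barycenter of 1D Gaussian posteriors.

    In one dimension the barycenter averages quantile functions, so the
    result is Gaussian with weighted-average mean and standard deviation.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    m = np.asarray(means, dtype=float)
    s = np.sqrt(np.asarray(variances, dtype=float))
    mean = (w * m).sum()
    std = (w * s).sum()
    return mean, std ** 2


# Hypothetical posteriors for one missing return:
#   index 0: conditioned on training data only (no look-ahead, wider)
#   index 1: conditioned on the full panel (look-ahead, narrower)
means = [0.0, 2.0]
variances = [4.0, 1.0]
weights = [0.5, 0.5]  # assumed weights; the paper optimizes this choice

kl_mean, kl_var = kl_consensus(means, variances, weights)
w_mean, w_var = wasserstein_consensus(means, variances, weights)
```

Shifting weight toward the training-only posterior reduces reliance on future information at the cost of a wider consensus, which is the trade-off the abstract describes. How the weights are actually chosen is the subject of the paper and is not reproduced here.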