Paper title
Decision-Making with Auto-Encoding Variational Bayes
Authors
Abstract
To make decisions based on a model fit with auto-encoding variational Bayes (AEVB), practitioners often let the variational distribution serve as a surrogate for the posterior distribution. This approach yields biased estimates of the expected risk, and therefore leads to poor decisions for two reasons. First, the model fit with AEVB may not equal the underlying data distribution. Second, the variational distribution may not equal the posterior distribution under the fitted model. We explore how fitting the variational distribution based on several objective functions other than the ELBO, while continuing to fit the generative model based on the ELBO, affects the quality of downstream decisions. For the probabilistic principal component analysis model, we investigate how importance sampling error, as well as the bias of the model parameter estimates, varies across several approximate posteriors when used as proposal distributions. Our theoretical results suggest that a posterior approximation distinct from the variational distribution should be used for making decisions. Motivated by these theoretical results, we propose learning several approximate proposals for the best model and combining them using multiple importance sampling for decision-making. In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing. In this challenging instance of multiple hypothesis testing, our proposed approach surpasses the current state of the art.
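As a rough illustration of the multiple importance sampling idea mentioned in the abstract, the sketch below estimates a posterior expectation by pooling samples from several proposal distributions and weighting each sample against the equal-weight mixture of those proposals (the balance heuristic). It is not the paper's implementation. The one-dimensional Gaussian model (z ~ N(0, 1), x | z ~ N(z, sigma^2)), the two hand-picked Gaussian proposals, and the names `log_normal` and `mis_posterior_expectation` are all assumptions made for illustration.

```python
import numpy as np


def log_normal(value, mean, std):
    """Log density of a univariate Gaussian, evaluated elementwise."""
    return (-0.5 * np.log(2.0 * np.pi) - np.log(std)
            - 0.5 * ((value - mean) / std) ** 2)


def mis_posterior_expectation(f, x, proposals, n_per_proposal,
                              sigma=1.0, seed=0):
    """Self-normalized multiple importance sampling estimate of E[f(z) | x].

    Assumed toy model (not from the paper): z ~ N(0, 1), x | z ~ N(z, sigma^2).
    `proposals` is a list of (mean, std) pairs describing Gaussian proposals.
    """
    rng = np.random.default_rng(seed)

    # Draw the same number of samples from every proposal and pool them.
    z = np.concatenate(
        [rng.normal(m, s, size=n_per_proposal) for m, s in proposals]
    )

    # Unnormalized log posterior: log p(z) + log p(x | z).
    log_joint = log_normal(z, 0.0, 1.0) + log_normal(x, z, sigma)

    # Balance heuristic: weight against the equal-weight mixture of proposals.
    log_q = np.stack([log_normal(z, m, s) for m, s in proposals])
    log_mix = np.logaddexp.reduce(log_q, axis=0) - np.log(len(proposals))

    # Self-normalize the weights in a numerically stable way.
    log_w = log_joint - log_mix
    w = np.exp(log_w - log_w.max())
    w /= w.sum()

    return float(np.sum(w * f(z)))


if __name__ == "__main__":
    x_obs = 1.5
    # One narrow proposal near the posterior mode and one broad proposal.
    proposals = [(0.75, 0.4), (0.0, 2.0)]
    estimate = mis_posterior_expectation(lambda z: z, x_obs, proposals, 20000)
    # For sigma = 1 the exact posterior is N(x / 2, 1 / 2).
    print("MIS posterior mean:", estimate, "exact:", x_obs / 2.0)
```

In this toy model the exact posterior is available in closed form, so the estimate can be checked directly. Pairing a narrow proposal with a broad one reflects the abstract's point that a single variational distribution may be a poor proposal on its own: the mixture keeps the importance weights bounded in the tails.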