Paper Title
Empirical Bayes Multistage Testing for Large-Scale Experiments
Paper Authors
Paper Abstract
Modern applications of A/B testing are challenging because of their large scale along several dimensions, which demands the flexibility to handle multiple tests sequentially. State-of-the-art practice first reduces the observed data stream to always-valid p-values and then chooses a cut-off as in conventional multiple testing schemes. Here we propose an alternative method, AMSET (adaptive multistage empirical Bayes test), which incorporates historical data into decision-making to achieve efficiency gains while retaining control of the marginal false discovery rate (mFDR) that is immune to peeking. We also show that fully data-driven estimation in AMSET performs robustly across various simulated and real-data settings at a large mobile app social network company.