Paper Title
Real-world Video Adaptation with Reinforcement Learning
Paper Authors
Paper Abstract
Client-side video players employ adaptive bitrate (ABR) algorithms to optimize user quality of experience (QoE). We evaluate recently proposed RL-based ABR methods in Facebook's web-based video streaming platform. Real-world ABR poses several challenges that require customized designs beyond off-the-shelf RL algorithms: we implement a scalable neural network architecture that supports videos with arbitrary bitrate encodings; we design a training method to cope with the variance resulting from the stochasticity in network conditions; and we leverage constrained Bayesian optimization for reward shaping in order to optimize the conflicting QoE objectives. In a week-long worldwide deployment with more than 30 million video streaming sessions, our RL approach outperforms the existing human-engineered ABR algorithms.