Paper title
One for Many: Transfer Learning for Building HVAC Control
Paper authors
Paper abstract
The design of building heating, ventilation, and air conditioning (HVAC) systems is critically important, as these systems account for around half of building energy consumption and directly affect occupant comfort, productivity, and health. Traditional HVAC control methods are typically based on explicit physical models of building thermal dynamics. Such models often require significant effort to develop, and they struggle to achieve sufficient accuracy and efficiency for runtime building control and sufficient scalability for field implementations. Recently, deep reinforcement learning (DRL) has emerged as a promising data-driven method that provides good control performance without analyzing physical models at runtime. However, a major challenge for DRL (and many other data-driven learning methods) is the long training time needed to reach the desired performance. In this work, we present a novel transfer-learning-based approach to overcome this challenge. Our approach can effectively transfer a DRL-based HVAC controller trained for the source building to a controller for the target building with minimal effort and improved performance. It does so by decomposing the design of the neural network controller into a transferable front-end network that captures building-agnostic behavior and a back-end network that can be efficiently trained for each specific building. We conducted experiments on a variety of transfer scenarios between buildings with different sizes, numbers of thermal zones, materials and layouts, air conditioner types, and ambient weather conditions. The experimental results demonstrated the effectiveness of our approach in significantly reducing the training time, energy cost, and temperature violations.
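The front-end/back-end decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture or training procedure: the layer sizes, the function names (`init_layer`, `front_end`, `back_end`, `train_back_end`), the synthetic data, and the use of plain supervised regression in place of DRL are all assumptions made for illustration. The sketch only shows the general idea of reusing a frozen shared network and fitting a small building-specific network on top of it.

```python
import math
import random

random.seed(0)

def init_layer(n_in, n_out, scale=0.1):
    """Hypothetical dense layer: weights (n_in x n_out) plus a bias vector."""
    return {
        "W": [[random.gauss(0.0, scale) for _ in range(n_out)] for _ in range(n_in)],
        "b": [0.0] * n_out,
    }

def affine(layer, x):
    """Compute x @ W + b for a single input vector x."""
    return [
        sum(xi * row[j] for xi, row in zip(x, layer["W"])) + layer["b"][j]
        for j in range(len(layer["b"]))
    ]

def front_end(layer, state):
    """Shared feature extractor, reused unchanged across buildings (assumed)."""
    return [math.tanh(v) for v in affine(layer, state)]

def back_end(layer, features):
    """Building-specific head mapping shared features to control outputs."""
    return affine(layer, features)

def controller(front, back, state):
    return back_end(back, front_end(front, state))

def mse(front, back, states, targets):
    """Mean over samples of the squared error summed over outputs."""
    total = 0.0
    for s, t in zip(states, targets):
        total += sum((p - q) ** 2 for p, q in zip(controller(front, back, s), t))
    return total / len(states)

def train_back_end(front, back, states, targets, lr=0.5, epochs=200):
    """Gradient descent on the back-end only; the front-end is never updated."""
    feats = [front_end(front, s) for s in states]  # features are fixed
    n = len(states)
    for _ in range(epochs):
        for h, t in zip(feats, targets):
            err = [p - q for p, q in zip(back_end(back, h), t)]
            for j, e in enumerate(err):
                for i, hi in enumerate(h):
                    back["W"][i][j] -= lr * e * hi / n
                back["b"][j] -= lr * e / n
    return back

# Synthetic stand-in for target-building data (not taken from the paper).
states = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(64)]
targets = [[0.5 * s[0] - 0.2 * s[1], 0.3 * s[2] + 0.1 * s[3]] for s in states]

front = init_layer(4, 8)   # stands in for a front-end trained on the source building
frozen_copy = [row[:] for row in front["W"]]
back = init_layer(8, 2)    # fresh back-end for the target building

loss_before = mse(front, back, states, targets)
train_back_end(front, back, states, targets)
loss_after = mse(front, back, states, targets)
```

Because only the back-end parameters are updated, the cost of adapting to a new building in this toy setting scales with the size of the small head rather than with the whole controller, which is the intuition behind the reduced training time claimed in the abstract.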