Balancing thermal comfort datasets: We GAN, but should we?

Paper title

Balancing thermal comfort datasets: We GAN, but should we?

Authors

Quintana, Matias; Schiavon, Stefano; Tham, Kwok Wai; Miller, Clayton

Abstract

Thermal comfort assessment for the built environment has become more available to analysts and researchers due to the proliferation of sensors and subjective feedback methods. These data can be used for modeling comfort behavior to support design and operations towards energy efficiency and well-being. By nature, occupant subjective feedback is imbalanced, as indoor conditions are designed for comfort and responses indicating otherwise are less common. This situation creates a scenario for the machine learning workflow where class balancing as a pre-processing step might be valuable for developing high-performance predictive thermal comfort classification models. This paper investigates the various thermal comfort dataset class balancing techniques from the literature and proposes a modified conditional Generative Adversarial Network (GAN), $\texttt{comfortGAN}$, to address this imbalance scenario. These approaches are applied to three publicly available datasets, ranging from 30 and 67 participants to a global collection of thermal comfort datasets, with 1,474; 2,067; and 66,397 data points, respectively. This work finds that a classification model trained on a balanced dataset, composed of real samples and samples generated by $\texttt{comfortGAN}$, has higher performance (an increase of between 4% and 17% in classification accuracy) than the other augmentation methods tested. However, when classes representing discomfort are merged and the number of classes is reduced to three, better performance on the imbalanced data is expected, and the additional increase in performance from $\texttt{comfortGAN}$ shrinks to 1-2%. These results illustrate that class balancing for thermal comfort modeling is beneficial when using advanced techniques such as GANs, but its value is diminished in certain scenarios. A discussion is provided to assist potential users in determining in which scenarios this process is useful and which method works best.
