Paper Title
SysScale: Exploiting Multi-domain Dynamic Voltage and Frequency Scaling for Energy Efficient Mobile Processors
Authors
Abstract
There are three domains in a modern thermally-constrained mobile system-on-chip (SoC): compute, IO, and memory. We observe that a modern SoC typically allocates a fixed power budget, corresponding to worst-case performance demands, to the IO and memory domains even if they are underutilized. The resulting unfair allocation of the power budget across domains can cause two major issues: 1) the IO and memory domains can operate at a higher frequency and voltage than necessary, increasing power consumption and 2) the unused power budget of the IO and memory domains cannot be used to increase the throughput of the compute domain, hampering performance. To avoid these issues, it is crucial to dynamically orchestrate the distribution of the SoC power budget across the three domains based on their actual performance demands. We propose SysScale, a new multi-domain power management technique to improve the energy efficiency of mobile SoCs. SysScale is based on three key ideas. First, SysScale introduces an accurate algorithm to predict the performance (e.g., bandwidth and latency) demands of the three SoC domains. Second, SysScale uses a new DVFS (dynamic voltage and frequency scaling) mechanism to distribute the SoC power to each domain according to the predicted performance demands. Third, in addition to using a global DVFS mechanism, SysScale uses domain-specialized techniques to optimize the energy efficiency of each domain at different operating points. We implement SysScale on an Intel Skylake microprocessor for mobile devices and evaluate it using a wide variety of SPEC CPU2006, graphics (3DMark), and battery life workloads (e.g., video playback). On a 2-core Skylake, SysScale improves the performance of SPEC CPU2006 and 3DMark workloads by up to 16% and 8.9% (9.2% and 7.9% on average), respectively.
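The budget-redistribution idea in the abstract can be illustrated with a minimal sketch. This is not the SysScale algorithm or its Skylake implementation; it only shows the general shape of "pick the cheapest sufficient operating point for IO and memory, then hand the leftover budget to compute". The names `OperatingPoint`, `pick_point`, and `redistribute`, as well as every bandwidth and power figure, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """One hypothetical DVFS operating point of a non-compute domain."""
    bandwidth_gbps: float  # peak bandwidth this point can sustain
    power_w: float         # power drawn at this point


def pick_point(points, predicted_demand_gbps):
    """Return the lowest-power point that covers the predicted demand.

    Falls back to the highest-bandwidth point if none is sufficient.
    """
    feasible = [p for p in points if p.bandwidth_gbps >= predicted_demand_gbps]
    if feasible:
        return min(feasible, key=lambda p: p.power_w)
    return max(points, key=lambda p: p.bandwidth_gbps)


def redistribute(total_budget_w, domain_points, predicted_demand):
    """Split a fixed SoC budget between non-compute domains and compute.

    IO and memory receive the power of their chosen operating points;
    whatever remains of the total budget goes to the compute domain.
    """
    chosen = {
        name: pick_point(points, predicted_demand[name])
        for name, points in domain_points.items()
    }
    used_w = sum(p.power_w for p in chosen.values())
    compute_budget_w = max(total_budget_w - used_w, 0.0)
    return chosen, compute_budget_w


# Illustrative numbers only (assumptions, not measurements from the paper).
DOMAIN_POINTS = {
    "memory": [
        OperatingPoint(8.0, 1.0),
        OperatingPoint(16.0, 2.0),
        OperatingPoint(25.0, 3.0),
    ],
    "io": [
        OperatingPoint(2.0, 0.5),
        OperatingPoint(4.0, 1.0),
    ],
}

if __name__ == "__main__":
    # Light demand: both domains can drop to their lowest points.
    _, light = redistribute(15.0, DOMAIN_POINTS, {"memory": 6.0, "io": 1.0})
    # Worst-case demand: both domains need their highest points.
    _, worst = redistribute(15.0, DOMAIN_POINTS, {"memory": 25.0, "io": 4.0})
    print(f"compute budget, light demand: {light} W")
    print(f"compute budget, worst case:   {worst} W")
```

With these made-up numbers, a fixed worst-case allocation would always leave the compute domain the smaller budget, whereas a demand-driven allocation frees the difference whenever IO and memory are underutilized. A real implementation would also need the demand predictor and the domain-specific techniques the abstract mentions, which this sketch leaves out.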