Paper title
MLOS: An Infrastructure for Automated Software Performance Engineering
Paper authors
Paper abstract
Developing modern systems software is a complex task that combines business logic programming and Software Performance Engineering (SPE). The latter is an experimental and labor-intensive activity focused on optimizing the system for a given hardware, software, and workload (hw/sw/wl) context. Today's SPE is performed during build/release phases by specialized teams, and is cursed by: 1) a lack of standardized and automated tools, 2) significant repeated work as the hw/sw/wl context changes, and 3) fragility induced by "one-size-fits-all" tuning (where improvements on one workload or component may impact others). The net result: despite costly investments, system software often runs outside its optimal operating point, anecdotally leaving 30% to 40% of performance on the table. Recent developments in Data Science (DS) hint at an opportunity: combining DS tooling and methodologies with a new developer experience to transform the practice of SPE. In this paper we present MLOS, an ML-powered infrastructure and methodology to democratize and automate Software Performance Engineering. MLOS enables continuous, instance-level, robust, and trackable systems optimization. MLOS is being developed and employed within Microsoft to optimize SQL Server performance. Early results indicate that component-level optimizations can lead to 20%-90% improvements when custom-tuning for a specific hw/sw/wl context, hinting at a significant opportunity. However, several research challenges remain that will require community involvement. To this end, we are in the process of open-sourcing the MLOS core infrastructure, and we are engaging with academic institutions to create an educational program around Software 2.0 and MLOS ideas.