Paper Title

Active-Learning-as-a-Service: An Automatic and Efficient MLOps System for Data-Centric AI

Authors

Yizheng Huang, Huaizheng Zhang, Yuanming Li, Chiew Tong Lau, Yang You

Abstract

The success of today's AI applications requires not only model training (model-centric) but also data engineering (data-centric). In data-centric AI, active learning (AL) plays a vital role, but current AL tools 1) require users to select AL strategies manually and 2) cannot perform AL tasks efficiently. To this end, this paper presents an automatic and efficient MLOps system for AL, named ALaaS (Active-Learning-as-a-Service). Specifically, 1) ALaaS implements an AL agent, consisting of a performance predictor and a workflow controller, that decides the most suitable AL strategy given a user's dataset and budget. We call this a predictive-based successive halving early-stop (PSHEA) procedure. 2) ALaaS adopts a server-client architecture to support an AL pipeline and implements stage-level parallelism for high efficiency. In addition, caching and batching techniques further accelerate the AL process. Beyond efficiency, ALaaS ensures accessibility through a configuration-as-a-service design philosophy. Extensive experiments show that ALaaS outperforms all other baselines in terms of latency and throughput. Guided by the AL agent, ALaaS can also automatically select and run AL strategies for non-expert users across different datasets and budgets. Our code is available at https://github.com/MLSysOps/Active-Learning-as-a-Service.
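To make the idea of choosing an AL strategy under a budget more concrete, the following is a minimal sketch of plain successive halving with a score-threshold early stop. It is not the PSHEA procedure from the paper: the performance predictor is not modeled, and the names `successive_halving`, `evaluate`, `total_budget`, and `target_score` are hypothetical, introduced only for illustration. The sketch assumes `evaluate(name, n)` returns some validation score for strategy `name` after `n` labels have been spent on it.

```python
import math
from typing import Callable, Dict, List


def successive_halving(
    strategies: List[str],
    evaluate: Callable[[str, int], float],  # hypothetical scoring callback
    total_budget: int,
    target_score: float = float("inf"),
) -> str:
    """Pick one strategy by repeatedly discarding the worse-scoring half."""
    survivors = list(strategies)
    if not survivors:
        raise ValueError("need at least one strategy")

    # ceil(log2(n)) halving rounds reduce n candidates to a single one.
    rounds = max(1, math.ceil(math.log2(len(survivors))))
    per_round = total_budget // rounds
    labels_used: Dict[str, int] = {name: 0 for name in survivors}

    while len(survivors) > 1:
        # Split this round's labeling budget evenly among the survivors.
        share = per_round // len(survivors)
        scores: Dict[str, float] = {}
        for name in survivors:
            labels_used[name] += share
            scores[name] = evaluate(name, labels_used[name])

        ranked = sorted(survivors, key=lambda n: scores[n], reverse=True)

        # Early stop: the best candidate already meets the target.
        if scores[ranked[0]] >= target_score:
            return ranked[0]

        # Keep the better half (rounded up) for the next round.
        survivors = ranked[: math.ceil(len(ranked) / 2)]

    return survivors[0]


# Toy usage with made-up saturation curves; real scores would come from
# training and validating a model after each labeling round.
toy_ceiling = {"random": 0.5, "entropy": 0.7, "margin": 0.9, "coreset": 0.6}


def toy_evaluate(name: str, labels: int) -> float:
    return toy_ceiling[name] * (1 - 1 / (1 + labels))


best = successive_halving(list(toy_ceiling), toy_evaluate, total_budget=400)
```

With fewer survivors in each round, the remaining candidates receive a larger share of the budget, which is the usual motivation for successive halving: cheap early comparisons, more careful later ones.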
