Paper Title
PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers
Paper Authors
Paper Abstract
Data-free quantization can potentially address data privacy and security concerns in model compression, and thus has been widely investigated. Recently, PSAQ-ViT designed a relative value metric, patch similarity, to generate data from pre-trained vision transformers (ViTs), making the first attempt at data-free quantization for ViTs. In this paper, we propose PSAQ-ViT V2, a more accurate and general data-free quantization framework for ViTs, built on top of PSAQ-ViT. More specifically, following the patch similarity metric in PSAQ-ViT, we introduce an adaptive teacher-student strategy, which facilitates the constant cyclic evolution of the generated samples and the quantized model (student) in a competitive and interactive fashion under the supervision of the full-precision model (teacher), thus significantly improving the accuracy of the quantized model. Moreover, without auxiliary category guidance, we employ task- and model-independent prior information, making the general-purpose scheme compatible with a broad range of vision tasks and models. Extensive experiments are conducted on various models for image classification, object detection, and semantic segmentation tasks, and PSAQ-ViT V2, with a naive quantization strategy and without access to real-world data, consistently achieves competitive results, showing potential as a powerful baseline for data-free quantization of ViTs. For instance, with Swin-S as the (backbone) model, 8-bit quantization reaches 82.13 top-1 accuracy on ImageNet, 50.9 box AP and 44.1 mask AP on COCO, and 47.2 mIoU on ADE20K. We hope that the accurate and general PSAQ-ViT V2 can serve as a potential and practical solution in real-world applications involving sensitive data. Code is released and merged at: https://github.com/zkkli/PSAQ-ViT.
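The abstract names two mechanisms: a patch-similarity metric that steers sample generation, and an adaptive teacher-student game in which the generated samples and the quantized model evolve competitively under full-precision supervision. The PyTorch-style sketch below illustrates how such a loop could be wired up; all names (attach_patch_hook, patch_similarity_entropy, adversarial_round, generator, teacher_fp, student_q) and the kernel-density entropy estimator are assumptions for illustration, not the authors' released implementation (see the linked repository for that).

```python
# Hypothetical sketch of a patch-similarity-driven, data-free
# teacher-student quantization loop; names and structure are illustrative.
import math
import torch
import torch.nn.functional as F


def attach_patch_hook(vit_block: torch.nn.Module, store: dict):
    """Cache a ViT block's output patch tokens (assumed shape (B, N, D)) on each forward pass."""
    def hook(_module, _inputs, output):
        store["tokens"] = output
    return vit_block.register_forward_hook(hook)


def patch_similarity_entropy(tokens: torch.Tensor, bandwidth: float = 0.05, n_grid: int = 128):
    """Entropy of the pairwise cosine-similarity distribution of patch tokens.

    Real images tend to yield diverse patch similarities (high entropy), so
    maximizing this quantity pushes synthetic samples toward image-like
    statistics. A Gaussian kernel density estimate keeps it differentiable.
    """
    t = F.normalize(tokens, dim=-1)                      # (B, N, D)
    sim = (t @ t.transpose(1, 2)).flatten(1)             # (B, N*N), values in [-1, 1]
    grid = torch.linspace(-1.0, 1.0, n_grid, device=sim.device)
    diff = (grid.view(1, -1, 1) - sim.unsqueeze(1)) / bandwidth        # (B, G, N*N)
    density = torch.exp(-0.5 * diff ** 2).mean(dim=-1) / (bandwidth * math.sqrt(2 * math.pi))
    dx = grid[1] - grid[0]
    return (-(density * torch.log(density + 1e-8)).sum(dim=-1) * dx).mean()


def kd_loss(student_logits, teacher_logits):
    """Distillation loss: KL divergence between teacher and student output distributions."""
    return F.kl_div(F.log_softmax(student_logits, dim=-1),
                    F.softmax(teacher_logits, dim=-1),
                    reduction="batchmean")


def adversarial_round(generator, teacher_fp, student_q, patch_store,
                      g_opt, s_opt, batch=8, latent_dim=128):
    """One round of the competitive game: the generator evolves samples that are
    image-like yet hard for the student, then the quantized student is updated
    to match the full-precision teacher on such samples.

    Assumes teacher_fp is frozen (requires_grad=False) and student_q carries
    fake-quantized weights/activations with straight-through gradients.
    """
    device = next(teacher_fp.parameters()).device

    # --- Sample-evolution step (generator as adversary) ---
    fake = generator(torch.randn(batch, latent_dim, device=device))
    t_logits = teacher_fp(fake)                          # hook fills patch_store["tokens"]
    s_logits = student_q(fake)
    g_loss = -patch_similarity_entropy(patch_store["tokens"]) - kd_loss(s_logits, t_logits)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

    # --- Quantized-student update (teacher supervision, no real data) ---
    fake = generator(torch.randn(batch, latent_dim, device=device)).detach()
    with torch.no_grad():
        t_logits = teacher_fp(fake)
    s_logits = student_q(fake)
    s_loss = kd_loss(s_logits, t_logits)
    s_opt.zero_grad()
    s_loss.backward()
    s_opt.step()
    return g_loss.item(), s_loss.item()
```

In this sketch, the generator step pulls samples toward image-like patch statistics while seeking teacher-student disagreement, and the student step distills the full-precision teacher on those samples; note that the objective uses no category labels, which is what lets a scheme of this kind transfer to detection and segmentation backbones as described in the abstract.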