Paper Title

Efficient Test-Time Model Adaptation without Forgetting

Paper Authors

Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Yaofo Chen, Shijian Zheng, Peilin Zhao, Mingkui Tan

Paper Abstract

Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and testing data by adapting a given model w.r.t. any testing sample. This task is particularly important for deep models when the test environment changes frequently. Although some recent attempts have been made to handle this task, we still face two practical challenges: 1) existing methods have to perform backward computation for each test sample, resulting in unbearable prediction cost to many applications; 2) while existing TTA solutions can significantly improve the test performance on out-of-distribution data, they often suffer from severe performance degradation on in-distribution data after TTA (known as catastrophic forgetting). In this paper, we point out that not all the test samples contribute equally to model adaptation, and high-entropy ones may lead to noisy gradients that could disrupt the model. Motivated by this, we propose an active sample selection criterion to identify reliable and non-redundant samples, on which the model is updated to minimize the entropy loss for test-time adaptation. Furthermore, to alleviate the forgetting issue, we introduce a Fisher regularizer to constrain important model parameters from drastic changes, where the Fisher importance is estimated from test samples with generated pseudo labels. Extensive experiments on CIFAR-10-C, ImageNet-C, and ImageNet-R verify the effectiveness of our proposed method.
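To make the two ideas in the abstract concrete, below is a minimal PyTorch-style sketch of one adaptation step. It is not the authors' implementation: the function name tta_step, the dictionaries fisher and anchor_params (per-parameter importance estimates and pre-adaptation weights), and the hyperparameters e0 and beta are illustrative assumptions, and the paper's non-redundancy filter and pseudo-label-based Fisher estimation are omitted.

```python
import math

import torch


def tta_step(model, optimizer, x, fisher, anchor_params, e0=0.4, beta=2000.0):
    """One test-time adaptation step: entropy minimization on actively
    selected (low-entropy) test samples, plus a Fisher-weighted penalty
    that keeps important parameters close to their pre-adaptation values."""
    logits = model(x)                                  # forward pass on a test batch
    probs = logits.softmax(dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)

    # Active sample selection: keep only reliable (low-entropy) samples;
    # high-entropy samples tend to produce noisy, harmful gradients.
    threshold = e0 * math.log(logits.size(1))          # fraction of the max entropy ln(C)
    mask = entropy < threshold
    if not mask.any():
        return logits                                  # no reliable samples, skip the update

    loss = entropy[mask].mean()                        # entropy loss on selected samples

    # Anti-forgetting regularizer: penalize drift from the original weights,
    # weighted by a per-parameter Fisher importance estimate.
    for name, p in model.named_parameters():
        if p.requires_grad and name in fisher:
            loss = loss + beta * (fisher[name] * (p - anchor_params[name]) ** 2).sum()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return logits
```

In practice, only a small subset of parameters (e.g., the affine weights of normalization layers) would typically be marked trainable and adapted, which is the common setup in this line of test-time adaptation work.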
