Paper title

Assessing Dataset Bias in Computer Vision

Author

Deviyani, Athiya

Abstract

A biased dataset is one whose attributes generally have an uneven class distribution. These biases tend to propagate to the models trained on them, often leading to poor performance on the minority classes. In this project, we explore the extent to which various data augmentation methods alleviate intrinsic biases within a dataset. We applied several augmentation techniques to a sample of the UTKFace dataset: undersampling, geometric transformations, variational autoencoders (VAEs), and generative adversarial networks (GANs). We then trained a classifier on each augmented dataset and evaluated its performance on the native test set and on external facial recognition datasets. We also compared these classifiers with the state-of-the-art attribute classifier trained on the FairFace dataset. Our experiments show that training the model on StarGAN-generated images led to the best overall performance. Training on geometrically transformed images led to similar performance with a much shorter training time. The best-performing models also exhibit uniform performance across the classes within each attribute, which indicates that they mitigate the biases present in the baseline model trained on the original training set. Finally, we show that our model has better overall performance and consistency than the FairFace model on age and ethnicity classification across multiple datasets. Our final model achieves accuracies of 91.75%, 91.30%, and 87.20% on the UTKFace test set for the gender, age, and ethnicity attributes, respectively, with a standard deviation of less than 0.1 across the per-class accuracies within each attribute.
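
The abstract reports bias as the standard deviation between the per-class accuracies of each attribute. A minimal sketch of one way such a figure could be computed is shown below. The function names, the toy labels, and the choice of the population standard deviation over accuracies expressed as fractions are all assumptions made for illustration; the abstract does not specify the exact formula or scale used.

```python
from collections import defaultdict
from statistics import pstdev


def per_class_accuracy(y_true, y_pred):
    """Return {class_label: accuracy over the samples whose true label is that class}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


def accuracy_spread(y_true, y_pred):
    """Population standard deviation of the per-class accuracies.

    A value of 0.0 means every class is classified equally well.
    Assumption: accuracies are fractions in [0, 1], not percentages.
    """
    return pstdev(list(per_class_accuracy(y_true, y_pred).values()))


# Toy labels, made up purely for illustration (not data from the paper).
y_true = ["a", "a", "a", "a", "b", "b", "b", "b"]
y_pred = ["a", "a", "a", "b", "b", "b", "a", "a"]

acc = per_class_accuracy(y_true, y_pred)
spread = accuracy_spread(y_true, y_pred)
print(acc, spread)
```

A small spread alongside a high overall accuracy suggests that a model performs consistently across the classes of an attribute, rather than doing well only on the majority class.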
