Paper Title

Classifying Galaxy Morphologies with Few-Shot Learning

Authors

Zhang, Zhirui; Zou, Zhiqiang; Li, Nan; Chen, Yanli

Abstract

The taxonomy of galaxy morphology is critical in astrophysics because morphological properties are powerful tracers of galaxy evolution. With upcoming large-scale imaging surveys, billions of galaxy images will challenge astronomers to complete the classification task using traditional methods or human inspection. Consequently, machine learning, in particular supervised deep learning, has recently been widely employed to classify galaxy morphologies owing to its exceptional automation, efficiency, and accuracy. However, supervised deep learning requires extensive training sets, which entails a considerable workload; moreover, the results depend strongly on the characteristics of the training sets, which can bias the outcomes. In this study, we attempt few-shot learning to bypass these two issues. Our research adopts the dataset from the Galaxy Zoo Challenge Project on Kaggle, which we divide into five categories according to the corresponding truth table. We classify this dataset using few-shot learning based on Siamese Networks and supervised deep learning based on AlexNet, VGG_16, and ResNet_50, each trained separately on training sets of different sizes. We find that few-shot learning achieves the highest accuracy in most cases, and the most significant improvement is 21% compared to AlexNet when the training set contains 1000 images. In addition, to guarantee an accuracy of no less than 90%, few-shot learning needs ~6300 images for training, while ResNet_50 requires 13000 images. Given these advantages, few-shot learning is foreseeably suitable for the taxonomy of galaxy morphology and even for identifying rare astrophysical objects, despite limited training sets consisting of observational data only.
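
To make the idea of few-shot learning with a Siamese network concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not describe the encoder, the loss, or the class names, so the one-layer `embed` function, the contrastive loss with `margin=1.0`, the two-pixel "images", and the labels `"spiral"` and `"elliptical"` are all illustrative assumptions. The sketch shows only the two general ingredients: one encoder whose weights are shared by both branches, and classification of a query by its distance to a few labeled support examples.

```python
import math


def embed(image, weights):
    """Toy shared encoder (assumption): one linear layer followed by ReLU.

    Both branches of a Siamese network call this with the SAME weights.
    """
    return [max(0.0, sum(w * x for w, x in zip(row, image))) for row in weights]


def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(d, same_class, margin=1.0):
    """Contrastive loss (an assumed choice, not stated in the abstract).

    Same-class pairs are pulled together; different-class pairs are
    pushed at least `margin` apart.
    """
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


def classify(query, support, weights):
    """Return the label of the support image closest to the query."""
    q = embed(query, weights)
    nearest = min(support, key=lambda item: distance(q, embed(item[0], weights)))
    return nearest[1]


# Hypothetical two-pixel "images" with hypothetical labels.
weights = [[1.0, 0.0], [0.0, 1.0]]
support = [([1.0, 0.0], "spiral"), ([0.0, 1.0], "elliptical")]
prediction = classify([0.9, 0.2], support, weights)
```

In practice the encoder would be a convolutional network trained on image pairs. The point of the sketch is that a new class needs only a few labeled support images, not a retrained classifier.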
