Paper Title

Star-Galaxy Separation via Gaussian Processes with Model Reduction

Authors

Goumiri, Imène R.; Muyskens, Amanda L.; Schneider, Michael D.; Priest, Benjamin W.; Armstrong, Robert E.

Abstract

Modern cosmological surveys such as the Hyper Suprime-Cam (HSC) survey produce a huge volume of low-resolution images of both distant galaxies and dim stars in our own galaxy. Being able to automatically classify these images is a long-standing problem in astronomy and is critical to a number of different scientific analyses. Recently, the challenge of "star-galaxy" classification has been approached with Deep Neural Networks (DNNs), which are good at learning complex nonlinear embeddings. However, DNNs are known to extrapolate overconfidently on unseen data, and they require a large volume of training images that accurately capture the data distribution to be considered reliable. Gaussian Processes (GPs), which infer posterior distributions over functions and naturally quantify uncertainty, have not been a tool of choice for this task, mainly because popular kernels exhibit limited expressivity on complex and high-dimensional data. In this paper, we present a novel approach to the star-galaxy separation problem that uses GPs and reaps their benefits while solving many of the issues that traditionally affect them when classifying high-dimensional celestial image data. After an initial filtering of the raw star and galaxy image cutouts, we first reduce the dimensionality of the input images using Principal Component Analysis (PCA) and then apply GPs with a simple Radial Basis Function (RBF) kernel to the reduced data. Using this method, we greatly improve the accuracy of the classification over a basic application of GPs while also improving the computational efficiency and scalability of the method.
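The PCA-then-GP pipeline described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It rests on several assumptions: the "cutouts" are synthetic 8x8 Gaussian blobs (narrow for stars, wide for galaxies) rather than HSC data, the initial filtering step is omitted, and the classifier is GP regression on ±1 labels with the sign of the posterior mean taken as the class. That last choice is a simplification of full GP classification, which would need a non-Gaussian likelihood and approximate inference. The function names, the number of components (5), the length scale, and the noise level are all hypothetical choices for illustration.

```python
import numpy as np

def make_cutouts(n, width, rng, size=8, noise=0.05):
    """Synthetic stand-in for image cutouts: noisy Gaussian blobs, flattened."""
    coords = np.arange(size) - (size - 1) / 2
    xx, yy = np.meshgrid(coords, coords)
    r2 = (xx**2 + yy**2).ravel()
    widths = width * (1 + 0.1 * rng.standard_normal(n))  # per-image size jitter
    images = np.exp(-r2[None, :] / (2 * widths[:, None] ** 2))
    return images + noise * rng.standard_normal(images.shape)

def pca_fit(X, n_components):
    """Return the mean image and the top principal directions (rows)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project images onto the retained principal directions."""
    return (X - mean) @ components.T

def rbf_kernel(A, B, length_scale):
    """RBF kernel k(a, b) = exp(-||a - b||^2 / (2 * length_scale^2))."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_fit_predict(Z_train, y_train, Z_test, length_scale=1.0, noise=1e-2):
    """GP regression posterior mean and latent variance at the test points."""
    K = rbf_kernel(Z_train, Z_train, length_scale) + noise * np.eye(len(Z_train))
    K_s = rbf_kernel(Z_test, Z_train, length_scale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.maximum(1.0 - (v**2).sum(axis=0), 0.0)  # k(x, x) = 1 for RBF
    return mean, var

rng = np.random.default_rng(0)
n_per_class = 150
X = np.vstack([make_cutouts(n_per_class, 1.0, rng),   # "stars": narrow blobs
               make_cutouts(n_per_class, 2.0, rng)])  # "galaxies": wide blobs
y = np.concatenate([-np.ones(n_per_class), np.ones(n_per_class)])
perm = rng.permutation(len(X))
X, y = X[perm], y[perm]
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]

# Fit PCA on the training images only, then reduce 64 pixels to 5 features.
mean_img, components = pca_fit(X_train, n_components=5)
Z_train = pca_transform(X_train, mean_img, components)
Z_test = pca_transform(X_test, mean_img, components)

pred_mean, pred_var = gp_fit_predict(Z_train, y_train, Z_test)
accuracy = float(np.mean(np.sign(pred_mean) == y_test))
print(f"test accuracy: {accuracy:.3f}")
print(f"mean predictive variance: {pred_var.mean():.4f}")
```

Working in the reduced space keeps the kernel distances low-dimensional, which is the setting where a plain RBF kernel tends to behave well. The predictive variance returned alongside the mean is the kind of uncertainty estimate the abstract refers to.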
