Exploring machine learning methods for the Star/Galaxy Separation Problem

Authors:	Luiz N. da Costa, Gustavo Paiva Guedes, Riccardo Campisano, Marcello Serqueira, Ricardo L. C. Ogando, M. A. G. Maia, Eduardo Ogasawara, Eduardo Jabbur Machado, Eduardo Bezerra
Year of publication:	2016
Subject:	Artificial neural network; Computer science; Deep learning; Astrophysics::Cosmology and Extragalactic Astrophysics; 02 engineering and technology; Large Synoptic Survey Telescope; Astronomical survey; Machine learning; 01 natural sciences; Galaxy; Random forest; Support vector machine; Naive Bayes classifier; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Dark energy; 020201 artificial intelligence & image processing; Artificial intelligence; Data mining; 010303 astronomy & astrophysics
Source:	IJCNN
DOI:	10.1109/ijcnn.2016.7727189
Description:	For recent and planned deep astronomical surveys, it is important to tell stars and galaxies apart, a task known as the Star/Galaxy Separation Problem (SGSP). At faint magnitudes, the distinction between point-like and extended sources is fuzzy, which makes the SGSP a hard task. The problem is even harder for large surveys such as the Dark Energy Survey (DES) and, in the near future, the Large Synoptic Survey Telescope (LSST), because of their large data volumes. Hence, the search for classification methods that are both accurate and efficient is highly relevant. In this work, we present a comparative analysis of several machine learning methods targeted at solving the SGSP at faint magnitudes. The classification models were trained on data from the COSMOS survey. We use machine learning methods as distinct as artificial neural networks, k-nearest neighbors, Support Vector Machines, Random Forests, and Naive Bayes. The exploratory process was modeled as a data-centric workflow. The workflow was implemented on top of the Hadoop framework and was used to find the best parameter values for each classification method considered; of these methods, neural networks and random forests showed superior performance.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::193d632c7ba08c2cccd4550906cf37ad https://doi.org/10.1109/ijcnn.2016.7727189