Machine learning astrophysics from 21 cm lightcones: impact of network architectures and signal contamination

Author: Steven G. Murray, Giuseppe Fiameni, Andrei Mesinger, David Prelogovic, Nicolas Gillet
Contributors: Prelogovic, D., Mesinger, A., Murray, S., Fiameni, G., Gillet, N., Observatoire astronomique de Strasbourg (ObAS), Université de Strasbourg (UNISTRA)-Institut national des sciences de l'Univers (INSU - CNRS)-Centre National de la Recherche Scientifique (CNRS)
Language: English
Year of publication: 2021
Source: Monthly Notices of the Royal Astronomical Society, Oxford University Press (OUP), 2022, 509 (3), pp. 3852-3867. ⟨10.1093/mnras/stab3215⟩
ISSN: 0035-8711 (print), 1365-2966 (online)
DOI: 10.1093/mnras/stab3215
Description: Imaging the cosmic 21 cm signal will map out the first billion years of our Universe. The resulting 3D lightcone (LC) will encode the properties of the unseen first galaxies and physical cosmology. Here, we build on previous work using neural networks (NNs) to infer astrophysical parameters directly from 21 cm LC images. We introduce recurrent neural networks (RNNs), capable of efficiently characterizing the evolution along the redshift axis of 21 cm LC images. Using a large database of simulated cosmic 21 cm LCs, we compare the relative performance in parameter estimation of different network architectures. These include two types of RNNs, differing in complexity, as well as a more traditional convolutional neural network (CNN). For the ideal case of no instrumental effects, our simplest and easiest-to-train RNN performs best, with a mean squared parameter estimation error (MSE) that is lower by a factor of $\ge 2$ compared with the other architectures studied here, and a factor of $\ge 8$ lower than the previously studied CNN. We also corrupt the cosmic signal by adding noise expected from a 1000 h integration with the Square Kilometre Array, as well as excising a foreground-contaminated 'horizon wedge'. Parameter prediction errors increase when the NNs are trained on these contaminated LC images, though recovery is still good even in the most pessimistic case (with $R^2 \ge 0.5-0.95$). However, we find no notable differences in performance between network architectures on the contaminated images. We argue this is due to the size of our data set, highlighting the need for larger data sets and/or better data augmentation in order to maximize the potential of NNs in 21 cm parameter estimation.
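The abstract's core architectural idea is to encode each 2D sky-plane slice of the lightcone with a CNN, then let a recurrent layer track the evolution of those encodings along the redshift (line-of-sight) axis. The sketch below is one minimal way to realize such a CNN+RNN hybrid in Keras; it is not the authors' code, and the lightcone shape, layer sizes, and parameter count (`N_SLICES`, `NPIX`, `N_PARAMS`) are illustrative assumptions.

```python
# Minimal CNN+RNN sketch for lightcone parameter regression (illustrative only).
import tensorflow as tf

N_SLICES, NPIX, N_PARAMS = 30, 64, 4  # assumed lightcone shape and target count

# CNN that encodes a single 2D sky slice into a small feature vector.
slice_encoder = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
])

inputs = tf.keras.Input(shape=(N_SLICES, NPIX, NPIX, 1))
# Apply the same encoder to every redshift slice of the lightcone...
features = tf.keras.layers.TimeDistributed(slice_encoder)(inputs)
# ...and summarize the redshift evolution with a recurrent layer.
summary = tf.keras.layers.LSTM(64)(features)
outputs = tf.keras.layers.Dense(N_PARAMS)(summary)  # regress astrophysical parameters

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")  # MSE, the metric quoted above
```

The 'horizon wedge' excision mentioned in the abstract can likewise be sketched: foreground-contaminated modes occupy the region $|k_\parallel| < m\,k_\perp$ in cylindrical Fourier space, and excision zeroes them. The slope `m` below is a free placeholder, not the value adopted in the paper.

```python
# Schematic wedge excision in cylindrical (k_perp, k_par) Fourier space.
import numpy as np

def excise_wedge(lc_ft, k_perp, k_par, slope=1.0):
    """Zero Fourier modes inside the wedge |k_par| < slope * k_perp."""
    kpp, kpl = np.meshgrid(k_perp, k_par, indexing="ij")
    return lc_ft * (np.abs(kpl) >= slope * kpp)
```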
15 pages, 11 figures; updated to match the version published in MNRAS (minor changes)
Database: OpenAIRE