Description: |
Optimal Learning Machines (OLM) are systems that extract maximally informative representations from data. At a given resolution, they maximise the relevance, which is the entropy of their energy distribution. We show that the relevance lower bounds the mutual information between the representation and the hidden features that it extracts from the data. In order to understand their peculiar properties, we study $J_{i,j}=\pm J$ fully connected Ising models and contrast their properties with those of the Ising ferromagnet ($J_{i,j}=J$). The main finding is that optimal Ising learning machines are characterised by inhomogeneous distributions of couplings and that their relevance increases as $h_E\log n$ with the number $n$ of spins, with $h_E>1$. This contrasts with the behaviour of ferromagnets or spin glasses, which we argue have $h_E\le 1$. Learning performance is related to sub-extensive features of the models that are elusive to a thermodynamic treatment. Indeed, we find models with $h_E\ge 1$ that differ from the Ising ferromagnet by a sub-extensive number of couplings and that share the same thermodynamic properties with it. We also exhibit an architecture which suggests that superior learning performance does not require fine-tuning to a critical point. The space of couplings of OLM is dominated by a large connected component.
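The sketch below is an illustrative aid, not the authors' code: it computes the relevance, i.e. the entropy $H[E]$ of the energy distribution under the Gibbs measure, by exact enumeration for a small fully connected Ising model, contrasting a ferromagnet ($J_{i,j}=J$) with a random $\pm J$ coupling matrix. The pairwise energy $E(s)=-\sum_{i<j}J_{ij}s_is_j$, the inverse temperature $\beta=1$, the $1/n$ coupling scale, and the function name are assumptions made purely for illustration.

```python
# Illustrative sketch (assumed conventions, not the paper's code):
# relevance H[E] = entropy of the energy distribution of a fully connected
# Ising model under the Gibbs measure, computed by exact enumeration.
import itertools
import numpy as np

def relevance_and_resolution(J, beta=1.0):
    """Return (H[E], H[s]) in nats for a symmetric, zero-diagonal coupling matrix J."""
    n = J.shape[0]
    energies = []
    for s in itertools.product([-1, 1], repeat=n):
        s = np.array(s)
        # pairwise energy E(s) = -sum_{i<j} J_ij s_i s_j (assumed convention)
        energies.append(-0.5 * s @ J @ s)
    energies = np.array(energies)
    logw = -beta * energies
    logw -= logw.max()                       # numerical stability
    p = np.exp(logw); p /= p.sum()           # Gibbs distribution p(s)
    H_s = -np.sum(p * np.log(p))             # resolution: entropy of p(s)
    # group states with the same energy to obtain p(E)
    _, idx = np.unique(np.round(energies, 10), return_inverse=True)
    pE = np.bincount(idx, weights=p)
    H_E = -np.sum(pE[pE > 0] * np.log(pE[pE > 0]))   # relevance H[E]
    return H_E, H_s

n = 8
rng = np.random.default_rng(0)
J_ferro = np.ones((n, n)) - np.eye(n)                # ferromagnet J_ij = +1
signs = rng.choice([-1.0, 1.0], size=(n, n))
J_pm = np.triu(signs, 1); J_pm = J_pm + J_pm.T       # random J_ij = ±1 couplings
for name, J in [("ferromagnet", J_ferro), ("random ±J", J_pm)]:
    H_E, H_s = relevance_and_resolution(J / n)       # 1/n scale (mean-field convention)
    print(f"{name}: relevance H[E] = {H_E:.3f}, resolution H[s] = {H_s:.3f}")
```

Exact enumeration is only feasible for small $n$; probing the $h_E\log n$ scaling at larger sizes would require sampling, which is beyond this sketch.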