Complex distributions emerging in filtering and compression
Authors: | G. J. Baxter, R. A. da Costa, Sergey N. Dorogovtsev, José F. F. Mendes |
---|---|
Language: | English |
Year of publication: | 2019 |
Subject: | Spin glass; Artificial neural network; Computer science; Physics; QC1-999; FOS: Physical sciences; General Physics and Astronomy; Probability and statistics; Disordered Systems and Neural Networks (cond-mat.dis-nn); Filter (signal processing); Condensed Matter - Disordered Systems and Neural Networks; 01 natural sciences; Pseudorandom binary sequence; 010305 fluids & plasmas; Simple (abstract algebra); Compression (functional analysis); Physics - Data Analysis, Statistics and Probability; 0103 physical sciences; 010306 general physics; Algorithm; Data Analysis, Statistics and Probability (physics.data-an) |
Source: | Physical Review X, Vol 10, Iss 1, p 011074 (2020) |
Description: | In filtering, each output is produced by a certain number of different inputs. We explore the statistics of this degeneracy in an explicitly treatable filtering problem in which filtering performs the maximal compression of relevant information contained in inputs (arrays of zeros and ones). This problem serves as a reference model for the statistics of filtering and related sampling problems. The filter patterns in this problem conveniently allow a microscopic, combinatorial treatment, which lets us find the exact distribution of output degeneracies for arbitrary input sizes. We observe that the resulting degeneracy distribution decays as $e^{-c \log^{\alpha} d}$ with degeneracy $d$, where $c$ is a constant and the exponent $\alpha > 1$, i.e. faster than a power law. Importantly, its form depends essentially on the size of the input data set, appearing closer to a power-law dependence for small data set sizes than for large ones. We demonstrate that for sufficiently small input data set sizes, typical of empirical studies, this distribution could easily be perceived as a power law. We extend our results to filter patterns of various sizes and demonstrate that the shortest filter pattern provides the maximally informative representations of the inputs. Comment: 17 pages, 8 figures, 1 table, Supplementary Material |
Database: | OpenAIRE |
External link: |
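
The abstract's statement that a distribution of the form $e^{-c \log^{\alpha} d}$ with $\alpha > 1$ decays faster than a power law, yet can resemble one over a limited range, can be illustrated by looking at its local slope on a log-log plot, $\frac{d \log P}{d \log d} = -c\,\alpha \log^{\alpha-1} d$. The sketch below is only an illustration of that formula: the values of `C` and `ALPHA` and the helper names are assumptions chosen for the example and are not taken from the paper.

```python
import math


def log_density(d, c, alpha):
    """Logarithm of the unnormalised form exp(-c * log(d)**alpha), for d > 1."""
    return -c * math.log(d) ** alpha


def local_slope(d, c, alpha):
    """Analytic log-log slope d(log P)/d(log d) = -c * alpha * log(d)**(alpha - 1)."""
    return -c * alpha * math.log(d) ** (alpha - 1)


def numeric_slope(d, c, alpha, h=1e-4):
    """Central finite-difference estimate of the same slope, taken in log(d)."""
    step = math.log1p(h)
    upper = log_density(d * (1 + h), c, alpha)
    lower = log_density(d / (1 + h), c, alpha)
    return (upper - lower) / (2 * step)


# Hypothetical illustrative parameters; the paper's fitted values are not given here.
C, ALPHA = 0.5, 1.5

if __name__ == "__main__":
    for d in (10, 100, 1_000, 10_000):
        print(f"d = {d:>6}: local log-log slope = {local_slope(d, C, ALPHA):.3f}")
```

A pure power law would give the same slope at every `d`. Here the slope steepens only logarithmically, so over one or two decades of `d` it changes little, which is consistent with the abstract's remark that small data sets could be mistaken for power-law behaviour.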