Approaches to the classification of complex systems: Words, texts, and more

Author: Rovenchak, Andrij
Year of publication: 2022
Subject:
Source: in: Order, Disorder and Criticality: Advanced Problems of Phase Transition Theory, Vol. 7, edited by Yu. Holovatch, 209-246 (Singapore; River Ridge, NJ: World Scientific, 2023)
Document type: Working Paper
DOI: 10.1142/9789811260438_0005
Description: The chapter begins with an introduction to notions of quantitative linguistics, such as rank-frequency dependence, Zipf's law, and frequency spectra. Similarities between the distribution of words in texts and level occupation in quantum ensembles hint at a superficial analogy with statistical physics. This analogy makes it possible to define various parameters for texts, including "temperature", "chemical potential", entropy, and some others. Such parameters provide a set of variables for classifying texts, which serve as an example of complex systems. Moreover, texts are perhaps the easiest complex systems to collect and analyze. Similar approaches can be developed to study, for instance, genomes, owing to well-known linguistic analogies. We consider a couple of approaches to define nucleotide sequences in mitochondrial DNAs and viral RNAs and demonstrate their possible application as an auxiliary tool for the comparative analysis of genomes. Finally, we discuss entropy as one of the parameters that can be easily computed from rank-frequency dependences. Although entropy is a discriminating parameter in some problems of classification of complex systems, it can be given a proper interpretation only in a limited class of problems. Its overall role and significance remain an open issue so far.
Comment: Chapter submitted to the book: Order, Disorder and Criticality: Advanced Problems of Phase Transition Theory. Ed. by Yu. Holovatch. Vol. 7, 2022, World Scientific, Singapore
Database: arXiv
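
Note: The description states that entropy can be easily computed from rank-frequency dependences. The following is a minimal illustrative sketch of that idea, not the chapter's own procedure. It assumes the ordinary Shannon entropy in natural-log units, S = -sum_r p_r ln p_r with p_r = f_r / N, and a naive tokenizer that lowercases the text and keeps runs of letters; the function names `rank_frequency` and `shannon_entropy` are hypothetical, and the chapter may define entropy differently.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return absolute word frequencies sorted in descending order.

    The value at index r - 1 is the frequency of the word with rank r.
    Tokenization is an assumption: lowercase, runs of letters only.
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(tokens)
    return sorted(counts.values(), reverse=True)


def shannon_entropy(frequencies):
    """Shannon entropy (in nats) of a rank-frequency list.

    Assumed definition: S = -sum_r p_r * ln(p_r), with p_r = f_r / N.
    """
    total = sum(frequencies)
    return -sum((f / total) * math.log(f / total) for f in frequencies)


sample = "the cat saw the dog and the dog saw the cat"
freqs = rank_frequency(sample)
entropy = shannon_entropy(freqs)
print(freqs)
print(entropy)
```

For a text with V distinct words, this quantity lies between 0 (a single repeated word) and ln V (all words equally frequent), which gives a quick sanity check on any computed value.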