Linguistic Laws in Speech: The Case of Catalan and Spanish

Authors: Iván González Torre, Juan María Garrido, Antoni Hernández-Fernández, Lucas Lacasa
Contributors: Universitat Politècnica de Catalunya. Institut de Ciències de l'Educació, Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge
Year of publication: 2019
Subject:
050101 languages & linguistics
Speech production
Brevity law
Computer science
speech
media_common.quotation_subject
General Physics and Astronomy
02 engineering and technology
Glissando corpus
Measure (mathematics)
Article
Scaling
Herdan’s law
0202 electrical engineering, electronic engineering, information engineering
Speech
0501 psychology and cognitive sciences
Complement (set theory)
media_common
Zipf's law
Lingüística quantitativa
scaling
05 social sciences
lognormal distribution
Quantitative linguistics
Agreement
Linguistics
language.human_language
size-rank law
Size-rank law
Zipf’s law
Law
language
Menzerath–Altmann’s law
quantitative linguistics
020201 artificial intelligence & image processing
Catalan
Informàtica::Intel·ligència artificial::Llenguatge natural [Àrees temàtiques de la UPC]
Source: Recercat. Dipósit de la Recerca de Catalunya
Entropy
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Volume 21
Issue 12
ISSN: 1099-4300
DOI: 10.3390/e21121153
Description: In this work we consider Glissando Corpus, an oral corpus of Catalan and Spanish, and empirically analyze the presence of the four classical linguistic laws (Zipf’s law, Herdan’s law, Brevity law, and Menzerath–Altmann’s law) in oral communication, and further complement this with the analysis of two recently formulated laws: lognormality law and size-rank law. By aligning the acoustic signal of speech production with the speech transcriptions, we are able to measure and compare the agreement of each of these laws when measured in both physical and symbolic units. Our results show that these six laws are recovered in both languages but considerably more emphatically so when these are examined in physical units, hence reinforcing the so-called ‘physical hypothesis’, according to which linguistic laws might indeed have a physical origin and the patterns recovered in written texts would, therefore, be just a byproduct of the regularities already present in the acoustic signals of oral communication.
Database: OpenAIRE