Oxymoron: An Automatic Detection from the Corpus.

Autor: Senapati, Apurbalal
Zdroj: Procedia Computer Science; 2024, Vol. 244, p49-56, 8p
Abstrakt: An oxymoron is a linguistic phenomenon in which a pair of opposite or antonymous words are combined to convey a new meaning. Sometimes, it is used to express figurative, irony, or rhetoric within the text. This issue has received relatively less attention in the realms of linguistics and computational disciplines. Oxymorons play a significant role in various language-processing applications. This study represents a pioneering effort in the exploration of oxymorons in the Bengali language. A corpus-based study of oxymoron is a fundamental issue that has not been explored so far. A system has been proposed for the automated recognition of oxymorons from a given corpus. Frequency analysis, semantic similarity, and an antonym dictionary have been employed to discern oxymorons within the corpus. The system achieved promising results when tested on a Bengali corpus, and found 308 distinct oxymorons. A corpus-based descriptive statistics is measured in two different corpora. The most common oxymorons are ranked based on their frequency. Their notable presence underscores the importance of the Bengali language. This study aimed to explore fundamental questions concerning oxymorons, such as the automated detection of oxymorons within a corpus, descriptive statistics regarding oxymorons across languages, and the process of their construction and creation. Additionally, efforts were made to extract oxymorons from large language models using zero-shot prompts, but the results were not as promising compared to our proposed system. [ABSTRACT FROM AUTHOR]
Databáze: Supplemental Index