Lexical, Pragmatic and Linguistic Feature Based Two-Level Sarcasm Detection Using Machine Learning Techniques

Author: Paluck Deep, Archita Mittal, Sakshi Agarwal, Easha Pandey
Year of publication: 2021
Subject:
Source: Advances in Intelligent Systems and Computing ISBN: 9789811627118
Description: Sarcasm is the use of words that mean something different from what the speaker really intends, especially to show irritation, to insult someone, or simply to be funny. Detecting sarcasm is important to many natural language processing applications, such as sentiment analysis, social media handling, advertising and opinion mining. To understand sarcasm, people usually rely on the speaker’s facial expression, tone of voice, writing style, and how the speaker might feel about the subject. Even with such cues, sarcasm detection can be difficult, and some people struggle to understand sarcasm. Detection becomes doubly difficult in text compared with verbal communication. Beyond detecting the presence of sarcasm in text, there are many situations where one needs to know what kind of sarcasm has been used. Therefore, this work aims to automate the multiclass classification of sarcastic tweets into four categories (positive, negative, self-deprecating and miscellaneous) using three types of features: lexical, pragmatic and linguistic. For multiclass classification, four classifiers have been used: Logistic Regression, Naive Bayes, Random Forest and Support Vector Machine. The experimental results show accuracy of up to 89% for binary sarcasm detection and up to 39% for multiclass sarcasm detection.
Database: OpenAIRE