Detecting sarcasm in multi-domain datasets using convolutional neural networks and long short term memory network model

Author:	Eysha Saad, Ramish Jamil, Furqan Rustam, Gyu Sang Choi, Imran Ashraf, Arif Mehmood
Language:	English
Year of publication:	2021
Subject:	General Computer Science; Computer science; Feature extraction; Data Mining and Machine Learning; Decision tree; Machine learning; Convolutional neural network; Social media; Sarcasm detection; Artificial Intelligence; Classifier (linguistics); Sarcasm; Long short term memory network; Data Science; QA75.5-76.95; Random forest; Natural Language and Speech; Multi-domain sarcastic comments; Tree (data structure); Bag-of-words model; Electronic computers. Computer science
Source:	PeerJ Computer Science, Vol 7, p e645 (2021)
ISSN:	2376-5992
Description:	Sarcasm is common across social networking sites because people express negative thoughts, hatred, and opinions using positive vocabulary, which makes sarcasm challenging to detect. Although various studies have investigated sarcasm detection on baseline datasets, this work is the first to detect sarcasm in a multi-domain dataset constructed by combining Twitter and News Headlines datasets. This study proposes a hybrid approach in which convolutional neural networks (CNN) are used for feature extraction, while a long short-term memory (LSTM) network is trained and tested on those features. For performance analysis, several machine learning algorithms are used: random forest, support vector classifier, extra tree classifier, and decision tree. The performance of both the proposed model and the machine learning algorithms is analyzed using term frequency-inverse document frequency, the bag-of-words approach, and global vectors for word representation. Experimental results indicate that the proposed model surpasses the traditional machine learning algorithms, with an accuracy of 91.60%. Several state-of-the-art approaches for sarcasm detection are compared with the proposed model, and the results suggest that it outperforms these approaches in terms of precision, recall, and F1 score. The proposed model is accurate, robust, and performs sarcasm detection on a multi-domain dataset.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::412b910860b68b21a526e4ef5d7e3ede https://peerj.com/articles/cs-645/
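
The description above names bag-of-words and term frequency-inverse document frequency (TF-IDF) as two of the feature representations used in the comparison. As a rough illustration only, the sketch below computes both for a list of short texts in plain Python. It is not the authors' code: the function names (`tokenize`, `bag_of_words`, `tf_idf`), the whitespace tokenizer, and the unsmoothed weighting tf × log(N / df) are all assumptions made for this example, and the record does not state which TF-IDF variant the paper used.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; real pipelines usually
    # also strip punctuation, URLs, and stop words.
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, count matrix) for a list of documents."""
    tokenized = [tokenize(d) for d in docs]
    # Sorted vocabulary gives a stable column order.
    vocab = sorted({tok for doc in tokenized for tok in doc})
    counts = [Counter(doc) for doc in tokenized]
    # Counter returns 0 for terms that do not occur in a document.
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def tf_idf(docs):
    """Return (vocabulary, TF-IDF matrix) using tf * log(N / df)."""
    vocab, bow = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in bow if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weighted = []
    for row in bow:
        total = sum(row)  # document length in tokens
        weighted.append(
            [(count / total) * w if total else 0.0 for count, w in zip(row, idf)]
        )
    return vocab, weighted
```

Calling `tf_idf(["great weather today", "great great job"])` (two made-up texts) gives the term "great" a weight of zero in both rows, because it occurs in every document and so carries no discriminating information under this weighting. Library implementations typically add smoothing and normalization, so their values will differ from this sketch.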