Detecting sarcasm in multi-domain datasets using convolutional neural networks and long short term memory network model
Author: | Eysha Saad, Ramish Jamil, Furqan Rustam, Gyu Sang Choi, Imran Ashraf, Arif Mehmood |
---|---|
Language: | English |
Publication year: | 2021 |
Subject: |
General Computer Science
Computer science; Feature extraction; Data Mining and Machine Learning; Decision tree; Machine learning; Convolutional neural network; Social media; Sarcasm detection; Artificial Intelligence; Classifier (linguistics); Sarcasm; Long short term memory network; Data Science; QA75.5-76.95; Random forest; Natural Language and Speech; Multi-domain sarcastic comments; Tree (data structure); Bag-of-words model; Electronic computers. Computer science |
Source: | PeerJ Computer Science, Vol 7, p e645 (2021) |
ISSN: | 2376-5992 |
Description: | Sarcasm is common across social networking sites because people express negative thoughts, hatred and opinions using positive vocabulary, which makes sarcasm challenging to detect. Although various studies have investigated sarcasm detection on baseline datasets, this work is the first to detect sarcasm in a multi-domain dataset constructed by combining Twitter and News Headlines datasets. The study proposes a hybrid approach in which a convolutional neural network (CNN) is used for feature extraction while a long short-term memory (LSTM) network is trained and tested on those features. For comparison, several machine learning algorithms are used: random forest, support vector classifier, extra tree classifier and decision tree. The performance of both the proposed model and the machine learning algorithms is analyzed using term frequency-inverse document frequency (TF-IDF), the bag-of-words approach, and global vectors for word representation (GloVe). Experimental results indicate that the proposed model surpasses the traditional machine learning algorithms, reaching an accuracy of 91.60%. Several state-of-the-art approaches for sarcasm detection are compared with the proposed model, and the results suggest that it outperforms them in terms of precision, recall and F1 score. The proposed model is accurate, robust, and performs sarcasm detection on a multi-domain dataset. |
Database: | OpenAIRE |
External link: |
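
The description names bag-of-words and TF-IDF as two of the feature representations used in the comparison. The record does not say which weighting variant the authors used, so the following is only a minimal sketch under stated assumptions: whitespace tokenization, term frequency as count divided by document length, and the textbook inverse document frequency log(N / df). The function names (`tokenize`, `bag_of_words`, `tf_idf`) and the three toy sentences are hypothetical and are not taken from the paper or its datasets.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return (sorted vocabulary, one raw count vector per document)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors).

    Assumed variant: tf = count / document length, idf = ln(N / df).
    Libraries often use smoothed or normalized variants instead.
    """
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weighted = []
    for vec in counts:
        total = sum(vec)
        weighted.append(
            [(c / total) * w if total else 0.0 for c, w in zip(vec, idf)]
        )
    return vocab, weighted


# Hypothetical toy corpus, for illustration only.
docs = [
    "oh great another monday",
    "great news for the team",
    "monday meeting moved to friday",
]
vocab, bow = bag_of_words(docs)
_, weights = tf_idf(docs)
```

In this sketch a term that appears in only one document (such as "oh") receives a larger weight than one shared by two documents (such as "great"), which is the property that distinguishes TF-IDF from raw bag-of-words counts. Either kind of vector could then be passed to a classifier such as a random forest.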