Assessment of Long Short-Term Memory Network for Quora Sentiment Analysis
Author: | Vaibhav Kumar Seth, B. S. Prithvi, H. S. Sanjay, Subojit Mohanty |
---|---|
Year of publication: | 2021 |
Subject: | Stop words; General Computer Science; Computer science; Lemmatisation; Sentiment analysis; Filter (signal processing); Tokenization (data security); Preprocessor; Word2vec; Artificial intelligence; Electrical and Electronic Engineering; Natural language processing; Word (computer architecture) |
Source: | Journal of The Institution of Engineers (India): Series B. 103:375-384 |
ISSN: | 2250-2114 2250-2106 |
Description: | Quora is a popular online platform where users obtain answers to their questions, but it is often subject to irrelevant questions and answers posted by users. Such content needs to be filtered so that only relevant content is allowed. The present work focuses on filtering community-oriented irrelevant questions using machine learning, classifying each question as genuine or irrelevant. Preprocessing techniques such as tokenization, lemmatization and stop-word removal were used to convert the input into structured data, which was then fed to a Word2Vec embedding model that mapped every unique word to a corresponding vector in the embedding space. These vectors were given as input to a long short-term memory (LSTM) network that used ReLU activation and the Adam optimizer. The accuracy of the model was checked at the end of every epoch, and training was halted when accuracy dropped. An accuracy of 97% was obtained at the end of the final epoch. This approach successfully filtered the community-oriented irrelevant questions and could be extended to other anti-social content in the near future. |
Database: | OpenAIRE |
External link: |
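
The preprocessing and stopping rule named in the description can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the stop-word list, the lemma table, the function names and the integer encoding (a stand-in for the index that a Word2Vec embedding would look up before the LSTM) are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny stop-word list (assumption, not from the paper).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "in", "what", "these"}

# Hypothetical lemma lookup table; a real system would use a full lemmatizer.
LEMMAS = {"answers": "answer", "questions": "question", "posted": "post"}

PAD_ID, UNK_ID = 0, 1  # reserved ids for padding and unknown words (assumption)


def preprocess(text):
    """Tokenize, drop stop words, then map each token to its lemma."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    kept = [t for t in tokens if t not in STOP_WORDS]
    return [LEMMAS.get(t, t) for t in kept]


def build_vocab(token_lists):
    """Assign an integer id to every unique word, in order of first appearance."""
    vocab = {}
    for tokens in token_lists:
        for token in tokens:
            if token not in vocab:
                vocab[token] = len(vocab) + 2  # ids 0 and 1 are reserved
    return vocab


def encode(tokens, vocab, max_len):
    """Convert tokens to ids, truncating or right-padding to max_len."""
    ids = [vocab.get(t, UNK_ID) for t in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


def stop_epoch(accuracies):
    """Return the zero-based epoch at which accuracy first drops, else None."""
    for epoch in range(1, len(accuracies)):
        if accuracies[epoch] < accuracies[epoch - 1]:
            return epoch
    return None
```

In a full pipeline, the padded id sequences from `encode` would index into pretrained word vectors, and `stop_epoch` mirrors the idea of checking accuracy after each epoch and halting once it falls.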