Hadoop framework for efficient sentiment classification using trees
Author: | S. Daniel Madan Raja, K. Sridharan, G. Komarasamy |
---|---|
Year of publication: | 2020 |
Subject: | opinion words; Control and Optimization; Computer Networks and Communications; Group method of data handling; Computer science; Big data; Feature extraction; TK5101-6720; 02 engineering and technology; Management Science and Operations Research; 01 natural sciences; big data; 0202 electrical engineering electronic engineering information engineering; Feature (machine learning); stipulated time limit; Class (computer programming); Information retrieval; Application programming interface; business.industry; 010401 analytical chemistry; Sentiment analysis; extensive data analysis; 020206 networking & telecommunications; 0104 chemical sciences; Random forest; sentiment analysis; Telecommunication; traditional database software tool; business |
Source: | IET Networks, Vol 9, Iss 5, Pp 223-228 (2020) |
ISSN: | 2047-4962 2047-4954 |
DOI: | 10.1049/iet-net.2019.0208 |
Description: | Because data is being generated at an ever-increasing rate, the authors are forced to handle a massive volume of data with the help of conventional machine learning algorithms. Big data is an enormous volume of data that is beyond the capacity of traditional database software tools to collect, store, manage, and process within a stipulated time limit. Sentiment analysis analyses data by classifying text on the basis of the strength and polarity (positive/negative) of the opinion words that define it. When handling big data, Hadoop provides a platform on which users can develop their own sentiment analysis with the help of a lexicon dictionary, an available application programming interface (API), or external programs. The aim of classifying data is to analyse extensive data and develop an appropriate description or model for every organised class from the features present in the data. In this work, feature extraction based on term frequency-inverse document frequency is used with the Hadoop framework to attain a useful classification with the help of random forest techniques. |
Database: | OpenAIRE |
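
The description names term frequency-inverse document frequency (TF-IDF) as the feature-extraction step, but this record does not state which weighting variant the authors used. As a minimal sketch, assuming the common form tf = count / document length and idf = ln(N / df), the weights could be computed as below. The function name `tf_idf` and the toy reviews are hypothetical and not taken from the paper; the Hadoop pipeline and the random forest classifier are not reproduced here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenised document.

    Assumed weighting (not confirmed by the record):
    tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, for illustration only.
reviews = [
    "good phone good battery".split(),
    "bad phone poor battery".split(),
    "good screen".split(),
]
vectors = tf_idf(reviews)
```

Terms that occur in few documents (such as "bad" or "screen" here) receive a higher weight than terms shared across documents (such as "phone"), which is the property that makes TF-IDF vectors usable as input features for a classifier.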