Detecting opinion spams through supervised boosting approach.

Autor: Hazim M; Department of Computer System and Technology, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia., Anuar NB; Department of Computer System and Technology, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia., Ab Razak MF; Department of Computer System and Technology, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia.; Faculty of Computer Systems & Software Engineering, University Malaysia Pahang, Lebuhraya Tun Razak, Gambang, Kuantan, Pahang, Malaysia., Abdullah NA; Department of Computer System and Technology, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia.
Jazyk: angličtina
Zdroj: PloS one [PLoS One] 2018 Jun 11; Vol. 13 (6), pp. e0198884. Date of Electronic Publication: 2018 Jun 11 (Print Publication: 2018).
DOI: 10.1371/journal.pone.0198884
Abstrakt: Product reviews are the individual's opinions, judgement or belief about a certain product or service provided by certain companies. Such reviews serve as guides for these companies to plan and monitor their business ventures in terms of increasing productivity or enhancing their product/service qualities. Product reviews can also increase business profits by convincing future customers about the products which they have interest in. In the mobile application marketplace such as Google Playstore, reviews and star ratings are used as indicators of the application quality. However, among all these reviews, hereby also known as opinions, spams also exist, to disrupt the online business balance. Previous studies used the time series and neural network approach (which require a lot of computational power) to detect these opinion spams. However, the detection performance can be restricted in terms of accuracy because the approach focusses on basic, discrete and document level features only thereby, projecting little statistical relationships. Aiming to improve the detection of opinion spams in mobile application marketplace, this study proposes using statistical based features that are modelled through the supervised boosting approach such as the Extreme Gradient Boost (XGBoost) and the Generalized Boosted Regression Model (GBM) to evaluate two multilingual datasets (i.e. English and Malay language). From the evaluation done, it was found that the XGBoost is most suitable for detecting opinion spams in the English dataset while the GBM Gaussian is most suitable for the Malay dataset. The comparative analysis also indicates that the implementation of the proposed statistical based features had achieved a detection accuracy rate of 87.43 per cent on the English dataset and 86.13 per cent on the Malay dataset.
Competing Interests: The authors have declared that no competing interests exist.
Databáze: MEDLINE
Nepřihlášeným uživatelům se plný text nezobrazuje