Think Outside the Dataset
Authors: | Christopher Kruegel, Hojjat Aghakhani, Giovanni Vigna, Eric Gustafson, Shirin Nilizadeh |
---|---|
Year of publication: | 2019 |
Subject: |
Computer science; Data science; Reputation; Set (psychology); 01 natural sciences; 0101 mathematics; 010104 statistics & probability; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing |
Source: | WWW |
DOI: | 10.1145/3308558.3313647 |
Description: | While online review services provide a two-way conversation between brands and consumers, malicious actors, including misbehaving businesses, have an equal opportunity to distort the reviews for their own gain. We propose OneReview, a method for locating fraudulent reviews by correlating data from multiple crowd-sourced review sites. Our approach uses Change Point Analysis to locate points at which a business's reputation shifts. Inconsistent trends in reviews of the same business across multiple websites are used to identify suspicious reviews. We then extract an extensive set of textual and contextual features from these suspicious reviews and employ supervised machine learning to detect fraudulent reviews. We evaluated OneReview on about 805K reviews from Yelp and about 462K reviews from TripAdvisor to identify fraud on Yelp. Supervised machine learning yields excellent results, with 97% accuracy. We applied the resulting model to the suspicious reviews and detected about 62K fraudulent reviews (about 8% of all the Yelp reviews). We further analyzed the detected fraudulent reviews and their authors, and located several spam campaigns in the wild, including campaigns against specific businesses, as well as campaigns consisting of several hundred socially-networked untrustworthy accounts. |
Database: | OpenAIRE |
External link: |
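
The description mentions locating reputation shifts with Change Point Analysis and comparing trends for the same business across sites. The record does not say which change point algorithm or thresholds OneReview uses, so the following is only a minimal sketch under stated assumptions: it swaps in a simple single-shift CUSUM estimate, and the function names (`cusum_change_point`, `inconsistent_trend`), the `min_shift` threshold, and the sample ratings are all hypothetical.

```python
from statistics import mean


def cusum_change_point(ratings):
    """Estimate a single mean shift in a chronological list of ratings.

    Returns (index, shift), where index is the position of the first rating
    after the shift and shift is mean(after) - mean(before).
    Returns (None, 0.0) when the series is too short or perfectly flat.
    """
    if len(ratings) < 4:
        return None, 0.0
    avg = mean(ratings)
    running, best_idx, best_abs = 0.0, None, 0.0
    # Stop before the last element so both segments stay non-empty.
    for i, rating in enumerate(ratings[:-1]):
        running += rating - avg  # cumulative deviation from the overall mean
        if abs(running) > best_abs:
            best_abs, best_idx = abs(running), i + 1
    if best_idx is None:
        return None, 0.0
    shift = mean(ratings[best_idx:]) - mean(ratings[:best_idx])
    return best_idx, shift


def inconsistent_trend(site_a, site_b, min_shift=1.0):
    """Flag a business whose rating shift on site_a is not mirrored on site_b.

    min_shift is an assumed threshold (in rating stars), not a value from
    the paper.
    """
    idx_a, shift_a = cusum_change_point(site_a)
    _, shift_b = cusum_change_point(site_b)
    if idx_a is None or abs(shift_a) < min_shift:
        return False  # no notable shift on site_a, nothing to compare
    mirrored = shift_a * shift_b > 0 and abs(shift_b) >= min_shift
    return not mirrored


# Made-up star ratings for one business on two sites, oldest first.
site_a = [2, 2, 3, 2, 2, 5, 5, 5, 5, 5]
site_b = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]
suspicious = inconsistent_trend(site_a, site_b)
```

In this sketch, a business is marked suspicious only when one site shows a sizeable jump that the other site does not share in the same direction. Reviews posted around the detected index would then be the candidates passed on to feature extraction and a supervised classifier.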