Semi-Supervised Iterative Approach for Domain-Specific Complaint Detection in Social Media

Authors: Rakesh Gosangi, Akash Kumar Gautam, Rajiv Ratn Shah, Debanjan Mahata
Year of publication: 2020
Subject:
Source: Proceedings of The 3rd Workshop on e-Commerce and NLP.
Description: In this paper, we present a semi-supervised bootstrapping approach to detect product- or service-related complaints in social media. Our approach begins with a small collection of annotated samples, which are used to identify a preliminary set of linguistic indicators pertinent to complaints. These indicators are then used to expand the dataset, and the expanded dataset is in turn used to extract more indicators. This process is repeated until no new indicators can be found. We evaluated this approach on a Twitter corpus, specifically to detect complaints about transportation services. We started with an annotated set of 326 samples of transportation complaints, and after four iterations of the approach, we collected 2,840 indicators and over 3,700 tweets. We annotated a random sample of 700 tweets from the final dataset and observed that nearly half the samples were actual transportation complaints. Lastly, we also studied how different features based on semantics, orthographic properties, and sentiment contribute towards the prediction of complaints.
Database: OpenAIRE
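
The iterative loop outlined in the description (extract indicators, expand the dataset, repeat until no new indicators appear) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record does not say how indicators are extracted or how tweets are matched, so `extract_indicators`, `bootstrap`, the `min_count` threshold, the `stopwords` argument, and whitespace tokenization are all hypothetical stand-ins.

```python
from collections import Counter


def extract_indicators(texts, min_count=2, stopwords=frozenset()):
    """Hypothetical indicator extraction: keep tokens that occur in at
    least `min_count` texts. The paper's actual criterion is not given
    in this record."""
    counts = Counter()
    for text in texts:
        # Count each token once per text (document frequency).
        counts.update(set(text.lower().split()) - stopwords)
    return {token for token, count in counts.items() if count >= min_count}


def bootstrap(seed_texts, unlabeled_pool, max_iterations=10,
              min_count=2, stopwords=frozenset()):
    """Alternate between indicator extraction and dataset expansion
    until an iteration yields no new indicators."""
    dataset = list(seed_texts)
    remaining = list(unlabeled_pool)
    indicators = set()
    for _ in range(max_iterations):
        found = extract_indicators(dataset, min_count, stopwords)
        new_indicators = found - indicators
        if not new_indicators:
            break  # stopping condition: nothing new was found
        indicators |= new_indicators
        # Assumed matching rule: a text joins the dataset if it
        # contains at least one known indicator.
        still_remaining = []
        for text in remaining:
            if indicators & set(text.lower().split()):
                dataset.append(text)
            else:
                still_remaining.append(text)
        remaining = still_remaining
    return indicators, dataset
```

Calling `bootstrap(seed_texts, unlabeled_pool)` with a handful of annotated complaints and a pool of unlabeled tweets returns the accumulated indicator set and the expanded dataset. Because expansion here relies on simple token overlap, the expanded dataset would contain non-complaints as well, which is in line with the description's observation that only about half of the sampled tweets were actual complaints.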