Optimizing performance of sentiment analysis through design of experiments
Author: | Gary S. W. Goh, Andy J. L. Ang, Allan N. Zhang |
---|---|
Year of publication: | 2016 |
Subject: |
Process (engineering); business.industry; Computer science; Design of experiments; 0206 medical engineering; Sentiment analysis; Process design; Feature selection; 02 engineering and technology; Machine learning; computer.software_genre; Variety (cybernetics); Text mining; 0202 electrical engineering, electronic engineering, information engineering; Design process; 020201 artificial intelligence & image processing; Algorithm design; Data mining; Artificial intelligence; business; computer; 020602 bioinformatics |
Source: | IEEE BigData |
DOI: | 10.1109/bigdata.2016.7841042 |
Description: | Manually designing analytical processes is challenging because it requires a general analyst to have a good grasp of numerous algorithms and of the interaction effects between each technique and the data across multiple domains. In today's increasingly high-variety, multi-domain data environment, this design process can be especially laborious. In this paper, we describe a design optimization approach that uses design of experiments to determine a suitable design for a standardized text classification process with high classification performance. We focus on sentiment analysis as a use case, as standard analytical methods have been established for each phase of the sentiment analysis process, from data pre-processing through feature selection to classification. In our proposed approach, we present an automatic and domain-free technique for applying design of experiments to this design process, with the sentiment classification evaluation metrics as the performance criteria for optimization. In addition, we show that several interpretable analyses can be made to better understand the complex interaction effects of various analytical techniques with the data, which can then guide a general analyst in selecting more appropriate process design parameters for better text classification performance. |
Database: | OpenAIRE |
External link: |
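
The abstract describes applying design of experiments to the phases of a text classification process (pre-processing, feature selection, classification) and analysing the resulting effects. As a rough illustration only, the sketch below enumerates a full factorial design and estimates main effects. The record does not say which design the authors used, so the full factorial choice, the factor levels, and the `placeholder_score` function are all assumptions made for this example; in practice the score would come from training and evaluating a real classifier.

```python
from itertools import product
from statistics import mean

# Hypothetical factors and levels: the abstract names the three phases
# but not the concrete options, so these are illustrative only.
FACTORS = {
    "preprocessing": ["raw", "stemmed"],
    "feature_selection": ["none", "chi2"],
    "classifier": ["naive_bayes", "svm"],
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


def main_effects(runs, scores, factors):
    """Mean score at each level of each factor, minus the grand mean."""
    grand = mean(scores)
    effects = {}
    for name, levels in factors.items():
        for level in levels:
            at_level = [s for r, s in zip(runs, scores) if r[name] == level]
            effects[(name, level)] = mean(at_level) - grand
    return effects


def placeholder_score(run):
    """Stand-in for a real evaluation metric; the numbers are arbitrary."""
    score = 0.5
    if run["preprocessing"] == "stemmed":
        score += 0.1
    if run["classifier"] == "svm":
        score += 0.2
    # Arbitrary interaction term between feature selection and classifier.
    if run["feature_selection"] == "chi2" and run["classifier"] == "svm":
        score += 0.05
    return score


runs = full_factorial(FACTORS)
scores = [placeholder_score(run) for run in runs]
effects = main_effects(runs, scores, FACTORS)
best_run = runs[scores.index(max(scores))]

for (factor, level), effect in sorted(effects.items()):
    print(f"{factor}={level}: {effect:+.3f}")
print("best configuration:", best_run)
```

Because the design is balanced, the main effects of each two-level factor are equal in size and opposite in sign, which makes it easy to see which phase has the most influence on the score.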