Optimizing Natural Language Processing Pipelines: Opinion Mining Case Study
Autor: | Yudivián Almeida-Cruz, Andrés Montoyo, Yoan Gutiérrez, Suilan Estevez-Velarde |
---|---|
Rok vydání: | 2019 |
Předmět: |
education.field_of_study
Source code Optimization problem business.industry Process (engineering) Computer science media_common.quotation_subject Population Sentiment analysis Probabilistic logic Statistical model 02 engineering and technology computer.software_genre 020204 information systems 0202 electrical engineering electronic engineering information engineering 020201 artificial intelligence & image processing Artificial intelligence business education Metaheuristic computer Natural language processing media_common |
Zdroj: | Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications ISBN: 9783030339036 CIARP |
DOI: | 10.1007/978-3-030-33904-3_15 |
Popis: | This research presents NLP-Opt, an Auto-ML technique for optimizing pipelines of machine learning algorithms that can be applied to different Natural Language Processing tasks. The process of selecting the algorithms and their parameters is modelled as an optimization problem and a technique was proposed to find an optimal combination based on the metaheuristic Population-Based Incremental Learning (PBIL). For validation purposes, this approach is applied to a standard opinion mining problem. NLP-Opt effectively optimizes the algorithms and parameters of pipelines. Additionally, NLP-Opt outputs probabilistic information about the optimization process, revealing the most relevant components of pipelines. The proposed technique can be applied to different Natural Language Processing problems, and the information provided by NLP-Opt can be used by researchers to gain insights on the characteristics of the best-performing pipelines. The source code is made available for other researchers. In contrast with other Auto-ML approaches, NLP-Opt provides a flexible mechanism for designing generic pipelines that can be applied to NLP problems. Furthermore, the use of the probabilistic model provides a more comprehensive approach to the Auto-ML problem that enriches researcher understanding of the possible solutions. |
Databáze: | OpenAIRE |
Externí odkaz: |