Combining SMOTE Sampling and Machine Learning for Forecasting Wheat Yields in France

Authors: François Alin, Michaël Krajecki, Amine Chemchem
Contributors: Centre de Recherche en Sciences et Technologies de l'Information et de la Communication - EA 3804 (CRESTIC), Université de Reims Champagne-Ardenne (URCA)
Publication year: 2019
Subjects:
Computer science
Imbalanced Learning
02 engineering and technology
[INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]
Supervised Classification
Machine learning
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
Set (abstract data type)
Crop
Smart Agriculture
[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]
Knowledge extraction
020204 information systems
0202 electrical engineering, electronic engineering, information engineering
Oversampling
2. Zero hunger
[INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB]
Sampling (statistics)
Sampling Methods
Knowledge Discovery
Random forest
020201 artificial intelligence & image processing
Artificial intelligence
Source: AIKE
International Conference on Artificial Intelligence and Knowledge Engineering (AIKE)
International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), 2019, Cagliari, Italy. ⟨10.1109/AIKE.2019.00010⟩
DOI: 10.1109/aike.2019.00010
Description: This paper describes a machine-learning method for predicting wheat yields that accurately estimates wheat yield losses in France. Obtaining reliable estimates of yield losses is difficult because the underlying classification problem is highly imbalanced. In this study, we propose applying the Synthetic Minority Over-sampling Technique (SMOTE) as a preprocessing step before applying machine learning methods. The proposed approach improves learning accuracy and gives better results on the test set, as measured by the receiver operating characteristic (ROC). The comparative study shows that the best result, 90.07% on the test set, is obtained by combining the SMOTE algorithm with the Random Forest algorithm. The results obtained in this study for wheat yield can be extended to many other crops, such as maize and barley.
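Illustration: the oversampling step named in the description can be sketched in a few lines of pure Python. This is a simplified sketch of the SMOTE idea (interpolating between a minority sample and one of its nearest minority-class neighbours), not the authors' implementation, which the record does not describe. The function name `smote`, its parameters, and the toy data are assumptions made for this example; the Random Forest and ROC evaluation steps are omitted.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (Euclidean distance).

    Hypothetical helper for illustration only.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy, made-up data: 3 minority ("yield loss") samples vs. 9 majority samples.
minority = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4)]
majority = [(float(x), float(y)) for x in range(5, 8) for y in range(5, 8)]

new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(balanced_minority), len(majority))
```

After this step the two classes have the same number of samples, and a classifier such as a Random Forest could then be trained on the balanced set.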
Database: OpenAIRE