Combining SMOTE Sampling and Machine Learning for Forecasting Wheat Yields in France
Authors: | François Alin, Michaël Krajecki, Amine Chemchem |
---|---|
Contributors: | Centre de Recherche en Sciences et Technologies de l'Information et de la Communication - EA 3804 (CRESTIC), Université de Reims Champagne-Ardenne (URCA) |
Publication year: | 2019 |
Subject: |
Computer science; Imbalanced Learning; 02 engineering and technology; [INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]; Supervised Classification; Machine learning; computer.software_genre; [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; Set (abstract data type); Crop; Smart Agriculture; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; Knowledge extraction; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Oversampling; 2. Zero hunger; Index Terms: Machine Learning; [INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB]; business.industry; Sampling (statistics); Sampling Methods; Knowledge Discovery; Random forest; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer |
Source: | International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), 2019, Cagliari, Italy. ⟨10.1109/AIKE.2019.00010⟩ |
DOI: | 10.1109/aike.2019.00010 |
Description: | International audience; This paper describes a machine learning method for predicting wheat yields that accurately determines the value of wheat yield losses in France. Obtaining reliable yield-loss values is difficult because the underlying classification problem is highly imbalanced. In this study, we propose applying the Synthetic Minority Over-sampling Technique (SMOTE) as a preprocessing step before applying machine learning methods. The proposed approach improves learning accuracy and gives better results on the test set, as measured by the receiver operating characteristic (ROC). The comparative study shows that the best result, 90.07% on the test set, is obtained by hybridizing the SMOTE algorithm with the Random Forest algorithm. The results obtained in this study for wheat yield can be extended to many other crops such as maize, barley, ... |
Database: | OpenAIRE |
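The description mentions SMOTE as a preprocessing step applied before a classifier. The idea behind that step can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function name `smote`, its parameters, and the toy data are hypothetical, and the record gives no details of the real features or settings. Each synthetic sample is placed at a random position on the segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; parameters are assumptions.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the starting point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours and interpolate towards it.
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Made-up toy data: many "normal yield" rows, few "yield loss" rows.
majority = [(float(x), float(y)) for x in range(10) for y in range(3)]
minority = [(20.0, 20.0), (21.0, 20.5), (20.5, 22.0), (22.0, 21.0)]

# Generate enough synthetic minority rows to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In a pipeline like the one described, oversampling would be applied to the training split only, and a classifier such as Random Forest would then be trained on the balanced data and scored on the untouched test set.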