Predictive modeling for wine authenticity using a machine learning approach

Authors: Nattane Luíza da Costa, Leonardo A. Valentin, Inar Alves Castro, Rommel Melgaço Barbosa
Language: English
Year of publication: 2021
Subject:
Source: Artificial Intelligence in Agriculture, Vol 5, Iss , Pp 157-162 (2021)
Document type: article
ISSN: 2589-7217
DOI: 10.1016/j.aiia.2021.07.001
Description: The purpose of this paper is to classify wines from four South American countries. Each class consists of samples that experts consider representative of one of the following commercial categories: “Argentinean Malbec (AM)”, “Brazilian Merlot (BM)”, “Uruguayan Tannat (UT)” and “Chilean Carménère (CC)”. The 83 samples collected were analyzed for their composition of volatile, semi-volatile and phenolic compounds. We built a classification decision system based on support vector machines (SVM), combined with correlation-based feature selection (CFS) and random forest importance (RFI), which measures the relative importance of the input variables. First, we used CFS to select a subset of variables from among 190 chemical compounds. Thirteen chemicals were selected as correlated with the category and uncorrelated with each other. These compounds were then ordered by the importance ranking given by RFI and classified with SVM. The study indicated that SVM combined with feature selection methods was able to identify the most important chemicals for classifying the wine samples. Among the compounds identified in the wine samples, the variable subset defined by the feature selection methods (catechin, gallic, octanoic acid, myricetin, caffeic, isobutanol, resveratrol, kaempferol, and ORAC) achieved an accuracy of 93.97% in classifying the commercial categories.
Database: Directory of Open Access Journals
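
The abstract describes CFS as keeping variables that are correlated with the category and uncorrelated with each other. A minimal sketch of that idea follows. It is not the authors' implementation: it assumes absolute Pearson correlation against one-vs-rest class indicators and a greedy forward search, whereas CFS as usually described uses symmetrical uncertainty and best-first search. The data are synthetic, the function names (pearson, class_correlation, merit, cfs_forward) are hypothetical, and the RFI ranking and SVM steps are not shown.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def class_correlation(column, labels):
    """Assumption: mean |r| between a feature and each one-vs-rest class indicator."""
    classes = sorted(set(labels))
    total = 0.0
    for c in classes:
        indicator = [1.0 if lab == c else 0.0 for lab in labels]
        total += abs(pearson(column, indicator))
    return total / len(classes)


def merit(subset, r_cf, r_ff):
    """CFS merit: k * mean(r_cf) / sqrt(k + k * (k - 1) * mean(r_ff))."""
    k = len(subset)
    mean_cf = sum(r_cf[j] for j in subset) / k
    if k == 1:
        return mean_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    mean_ff = sum(r_ff[a][b] for a, b in pairs) / len(pairs)
    return k * mean_cf / math.sqrt(k + k * (k - 1) * mean_ff)


def cfs_forward(columns, labels):
    """Greedy forward search: add the feature that most improves the merit."""
    p = len(columns)
    r_cf = [class_correlation(col, labels) for col in columns]
    r_ff = [[abs(pearson(columns[a], columns[b])) for b in range(p)]
            for a in range(p)]
    selected, best = [], 0.0
    while len(selected) < p:
        candidates = [(merit(selected + [j], r_cf, r_ff), j)
                      for j in range(p) if j not in selected]
        score, j = max(candidates)
        if score <= best + 1e-12:  # stop when no candidate improves the merit
            break
        selected.append(j)
        best = score
    return selected, best


# Synthetic stand-in data: 4 categories, 10 samples each, 4 features.
rng = random.Random(0)
classes = ["AM", "BM", "UT", "CC"]
labels = [c for c in classes for _ in range(10)]
level = {"AM": 0.0, "BM": 1.0, "UT": 2.0, "CC": 3.0}

columns = [
    [level[lab] + rng.gauss(0, 0.3) for lab in labels],            # informative
    [(level[lab] % 2) * 2.0 + rng.gauss(0, 0.3) for lab in labels],  # informative
    [1.0 for _ in labels],                                          # constant
    [rng.gauss(0, 1.0) for _ in labels],                            # pure noise
]

selected, score = cfs_forward(columns, labels)
print("selected feature indices:", selected)
print("subset merit:", round(score, 3))
```

The merit rises when a candidate is well correlated with the category and falls when it is correlated with features already chosen, so a constant or redundant column tends to be left out. In the workflow the abstract describes, the selected subset would then be ranked by random forest importance and passed to an SVM classifier.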