Binary classification of pumpkin (Cucurbita pepo L.) seeds based on quality features using machine learning algorithms.

Authors: Çetin, Necati; Ropelewska, Ewa; Fidan, Sali; Ülkücü, Şükrü; Saban, Pembe; Günaydın, Seda; Ünlükara, Ali
Subject:
Source: European Food Research & Technology; Feb 2024, Vol. 250, Issue 2, pp. 409-423, 15 p.
Abstract: Mass, size, and shape attributes are important for the design of planters, breeding studies, and quality assessment. In recent years, machinery design and system development studies have taken these factors into consideration. The aim of this study is to explore machine learning classification models for four pumpkin seed varieties according to their physical characteristics. Binary classification is important because it ensures that the quality characteristics of the seeds are very similar to each other. The pumpkin seed varieties Develi, Sena Hanım, Türkmen, and Mertbey were discriminated in pairs. Five machine learning algorithms (Naïve Bayes, NB; support vector machine, SVM; random forest, RF; multilayer perceptron, MLP; and k-nearest neighbors, kNN) were applied to assess the classification performance. Among all pairs, Develi and Mertbey were discriminated with the highest accuracies: 85.00% for the MLP model, 84.50% for the SVM model, and 83.50% for the RF model. For the MLP algorithm, the TP rate reached 0.790 for Develi and 0.910 for Mertbey, precision 0.898 for Develi and 0.813 for Mertbey, F-measure 0.840 for Develi and 0.858 for Mertbey, PRC area 0.894 for Develi and 0.896 for Mertbey, and ROC area 0.907 for both varieties. This pair was followed by Sena Hanım and Türkmen (84.50%, MLP) and Türkmen and Mertbey (82.50%, SVM). For the selected input attributes, the highest mass (0.23 g), length (22.08 for Mertbey, 21.43 for Sena Hanım), and geometric mean diameter (8.79 mm) values were obtained from the Sena Hanım variety, while the highest shape index (3.40) was obtained from the Mertbey variety. Multivariate statistical results showed that differences in attributes were significant (p < 0.01). Wilks' lambda statistics showed that the unexplained portion of the difference between groups was 46.60%. The Develi and Sena Hanım varieties, which had the lowest Mahalanobis distance values, had similar characteristics. The present results revealed that SVM and MLP may be used effectively and objectively for the classification of pumpkin seed varieties. [ABSTRACT FROM AUTHOR]
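
The MLP figures quoted in the abstract can be cross-checked against the standard metric definitions. The sketch below is not the authors' code. It assumes that the F-measure is the usual harmonic mean of precision and TP rate (recall), and that both varieties contribute the same number of seeds, which the abstract does not state. The names `f_measure`, `reported`, and `balanced_accuracy` are hypothetical and used only for illustration.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the usual F1 definition)."""
    return 2 * precision * recall / (precision + recall)


# Values copied from the abstract for the MLP model (Develi vs. Mertbey).
reported = {
    "Develi": {"tp_rate": 0.790, "precision": 0.898, "f_measure": 0.840},
    "Mertbey": {"tp_rate": 0.910, "precision": 0.813, "f_measure": 0.858},
}

# Recompute the F-measure from the rounded precision and TP rate.
f_scores = {
    name: f_measure(m["precision"], m["tp_rate"]) for name, m in reported.items()
}

# Assumption (not stated in the abstract): equal class sizes, so overall
# accuracy equals the mean of the two per-class TP rates.
balanced_accuracy = sum(m["tp_rate"] for m in reported.values()) / len(reported)

for name, score in f_scores.items():
    print(f"{name}: recomputed F = {score:.4f}, reported = {reported[name]['f_measure']}")
print(f"Mean TP rate = {balanced_accuracy:.4f}")
```

Because the inputs are already rounded to three decimals, the recomputed F-measures can differ from the reported ones in the last digit. Under the equal-class-size assumption, the mean TP rate is consistent with the reported 85.00% accuracy.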
Database: Complementary Index