Prediction of milk quantitative traits based on infrared spectroscopy using machine-learning methods

Author: Leonid Legashev, Lyubov Grishina, Alexander Sermyagin, Irina Bolodurina
Year of publication: 2022
Subject:
Source: Bulletin of the South Ural State University. Ser. Computer Technologies, Automatic Control & Radioelectronics. 22:47-56
ISSN: 2409-6571, 1991-976X
DOI: 10.14529/ctcr220305
Description: Fourier transform mid-infrared spectroscopy is a fast and inexpensive way to analyze cow's milk samples for fat, protein, lactose and other quantitative and qualitative indicators of milk quality. Modern data analysis tools can reveal relationships between pairs of quantitative and qualitative milk characteristics. Purpose of the study. To predict several key milk quality traits from infrared spectroscopy data and to assess the accuracy of the developed mathematical model. Methods. The work was carried out in the winter of 2022 on an experimental herd of Holsteinized black-and-white cattle (Krasnodar Territory). Milk traits were analyzed with a MilkoScan (FOSS) automatic analyzer using infrared spectroscopy, and the spectra obtained during the analysis of raw milk composition were exported. Twenty-three quantitative milk traits were studied: mass fraction of fat, protein (true and total), lactose, DSMR (dry skimmed milk residue), dry matter, casein, traces of acetone and beta-hydroxybutyrate, urea, freezing point, acidity of milk, myristic, palmitic, stearic and oleic fatty acids (FA), long-chain, medium-chain and short-chain fatty acids, monounsaturated and polyunsaturated fatty acids, saturated fatty acids, and trans fatty acids. The following methods were considered for predicting key milk traits: linear regression, regularized linear regression (Ridge, Lasso and ElasticNet), polynomial regression, partial least squares regression (PLSRegression) and Bayesian regression. A method for reducing the dimensionality of the infrared spectroscopy data was implemented, based on a random search over readings along the length of a window, and the most significant features were identified. Results. Models were developed for predicting six main indicators of milk quality: mass fraction of fat ('Fat'), mass fraction of casein ('Cas.B'), myristic ('C14:0') and oleic ('C18:1') fatty acids, and monounsaturated ('MUFA') and polyunsaturated ('PUFA') fatty acids, with a mean absolute error not exceeding 0.016. Conclusion. The results of the study will help to further improve the predictive ability of the equation for determining milk quality and composition according to new breeding traits of milk productivity, to reduce analysis costs and to monitor animal health at an early stage.
Database: OpenAIRE