Genetic Programming for Symbolic Regression: A Study on Fish Weight Prediction

Authors: Bing Xue, Linley K. Jesson, Yunhan Yang, Mengjie Zhang
Year of publication: 2021
Subject:
Source: CEC
Description: Fish weight is an important factor in fisheries science and management because it reflects the growth and living conditions of fish populations. A power regression model is commonly used to describe the relationship between fish length and weight. In this work, Genetic Programming (GP) for symbolic regression is used to build a new model for predicting fish weight. This approach allows more features to be included in the model so that hidden relationships can be discovered, and GP-based symbolic regression makes the model interpretable compared with other machine learning methods. A publicly available dataset covering four fish species is used; it includes more features than the fish length alone that existing models commonly rely on. The proposed GP-based symbolic regression method is evaluated on those four species. The results are compared with baseline weight prediction methods: Linear Regression, the Power Regression model, k-Nearest Neighbour, Ridge Regression, Decision Tree, Random Forest, Gradient Boosting, and Multilayer Perceptron. On the test set, GP performs better than, or at least as well as, the baseline methods. Furthermore, thanks to GP’s explicit feature selection ability, the generated GP models can select different features for different species to improve prediction performance. Some models are interpretable, with relatively simple expressions. The GP method is also able to find models that are similar to the power regression model but include more features than the single length feature, which improves prediction performance.
Database: OpenAIRE
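
The power regression baseline named in the description relates weight to length as weight = a * length^b. The sketch below shows one common way to fit such a model, by ordinary least squares on the log-log scale. It is an illustration only, not the authors' implementation: the function names `fit_power_model` and `predict_weight` are hypothetical, and the data are synthetic values generated from a known curve rather than the dataset used in the paper.

```python
import numpy as np


def fit_power_model(length, weight):
    """Fit weight = a * length**b by least squares on log-transformed data.

    Taking logs gives log(weight) = log(a) + b * log(length), which is a
    straight line in log-log space. Hypothetical helper for illustration.
    """
    log_length = np.log(np.asarray(length, dtype=float))
    log_weight = np.log(np.asarray(weight, dtype=float))
    # polyfit with degree 1 returns [slope, intercept]
    b, log_a = np.polyfit(log_length, log_weight, 1)
    return float(np.exp(log_a)), float(b)


def predict_weight(length, a, b):
    """Evaluate the fitted power model at the given lengths."""
    return a * np.asarray(length, dtype=float) ** b


# Synthetic data (an assumption, not real measurements): a = 0.01, b = 3.0
lengths = np.array([10.0, 15.0, 20.0, 25.0, 30.0])
weights = 0.01 * lengths ** 3.0

a, b = fit_power_model(lengths, weights)
predicted = predict_weight(lengths, a, b)
```

Because the synthetic weights follow the power law exactly, the fit recovers the generating parameters up to floating-point error. A symbolic regression model of the kind the description refers to would search over expressions that may combine several features, whereas this baseline uses length alone.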