Effects of Model Structural Complexity and Data Pre-Processing on Artificial Neural Network (ANN) Forecast Performance for Hydrological Process Modelling

Author:	Mustapha Mohammed, Lydia Ezekiel Pam, John Jiya Musa, Martins Yusuf Otache, Ibrahim Abayomi Kuti
Year of publication:	2021
Subject:	Artificial neural network; Dynamical systems theory; Dimension (vector space); Computer science; Range (statistics); Coherence (signal processing); Data pre-processing; Layer (object-oriented design); Algorithm; Backpropagation
Source:	Open Journal of Modern Hydrology. 11:1-18
ISSN:	2163-0496 2163-0461
DOI:	10.4236/ojmh.2021.111001
Description:	Choosing an Artificial Neural Network (ANN) structure is difficult, chiefly because there is no systematic way of establishing a suitable architecture. In view of this, the study examined the effects of ANN structural complexity and data pre-processing regime on forecast performance. Two feed-forward back-propagation configurations were employed: 1) a single-hidden-layer network, and 2) a double-hidden-layer network. The results showed generally that: a) an ANN with two hidden layers tends to be less robust and converges with less accuracy than its single-hidden-layer counterpart under identical conditions; b) for a univariate time series, phase-space reconstruction using the embedding dimension, which is based on dynamical systems theory, is an effective way of determining the appropriate number of ANN input neurons; and c) data pre-processing via the scaling approach excessively limits the output range of the transfer function. In specific terms, extreme-flow prediction capability was assessed on the basis of effective correlation, i.e., the percent maximum and minimum correlation coefficients (Rmax% and Rmin%). On average, for the one-day-ahead forecast during the training and validation phases respectively, the adopted network structures 8 7 5 (i.e., 8 input nodes, 7 nodes in the hidden layer, and 5 nodes in the output layer), 8 5 2 5 (8 nodes in the input layer, 5 nodes in the first hidden layer, 2 nodes in the second hidden layer, and 5 nodes in the output layer), and 8 4 3 5 (8 nodes in the input layer, 4 nodes in the first hidden layer, 3 nodes in the second hidden layer, and 5 nodes in the output layer) gave: 101.2, 99.4; 100.2, 218.3; 93.7, 95.0 in all instances irrespective of the training algorithm (i.e., pooled).
On the other hand, in terms of the percent of correct event prediction, the respective performances of the models for both low and high flows during the training and validation phases, respectively, were: 0.78, 0.96: 0.65, 0.87; 0.76, 0.93: 0.61, 0.83; and 0.79, 0.96: 0.65, 0.87. Thus, on the basis of coherence or regularity of prediction consistency, the ANN model 8 4 3 5 performed better. This implies that although the adoption of large hidden layers with correspondingly large neuronal signatures could be counter-productive because of network over-fitting, it may provide additional representational power. Based on the findings, the ANN model is by no means a substitute for conceptual watershed modelling; therefore, exogenous variables should be incorporated in streamflow modelling and forecasting exercises because of their hydrologic evolutions.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::245cdd85b769292947de38af6a036988 https://doi.org/10.4236/ojmh.2021.111001
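
Illustrative note: the description states that, for a univariate time series, the embedding dimension from phase-space reconstruction can set the number of ANN input neurons. The sketch below shows one way such a time-delay embedding could turn a flow series into 8-value input vectors with a one-day-ahead target. It is not the authors' code. The function name delay_embed, the delay tau = 1, and the single-value target are assumptions made for illustration; the record does not say what delay was used or how the 5 output nodes were defined.

```python
def delay_embed(series, dim=8, tau=1, horizon=1):
    """Build time-delay input vectors and look-ahead targets.

    dim     -- embedding dimension, used here as the number of input neurons
               (8 matches the input layer named in the description)
    tau     -- delay between successive coordinates (assumed, not from the source)
    horizon -- forecast lead in time steps (1 = one-day-ahead for daily data)
    """
    x = [float(v) for v in series]
    span = (dim - 1) * tau           # distance from first to last coordinate
    n = len(x) - span - horizon      # number of complete (input, target) pairs
    if n <= 0:
        raise ValueError("series too short for the requested embedding")
    # Row i holds x[i], x[i + tau], ..., x[i + (dim - 1) * tau]
    inputs = [[x[i + j * tau] for j in range(dim)] for i in range(n)]
    # Target is the value `horizon` steps after the last coordinate of row i
    targets = [x[i + span + horizon] for i in range(n)]
    return inputs, targets


# Hypothetical daily flow series of 20 values, used only to show the shapes
flow = list(range(20))
inputs, targets = delay_embed(flow)
print(len(inputs), len(inputs[0]))   # 12 8
```

Each row of inputs would feed the 8 input neurons; a different embedding dimension would change the width of the input layer accordingly.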