Training radial basis function neural networks: effects of training set size and imbalanced training sets

Authors: Lynne Boddy, Luan Al-Haddad, Colin W. Morris
Contributors: Beaussier, Catherine
Language: English
Year of publication: 2000
Subject:
Description: Obtaining training data for constructing artificial neural networks (ANNs) to identify microbiological taxa is not always easy. Often, only small data sets with different numbers of observations per taxon are available. Here, the effects of both the size of the training data set and an imbalanced number of training patterns for different taxa are investigated using radial basis function ANNs to identify up to 60 species of marine microalgae. The best networks trained to discriminate 20, 40 and 60 species gave overall percentage correct identification of 92, 84 and 77%, respectively. Between 100 and 200 patterns per species were sufficient in networks trained to discriminate 20, 40 or 60 species. For the 40- and 60-species data sets, an imbalance in the number of training patterns per species always affected training success; the greater the imbalance, the greater the effect. However, this could be largely compensated for by adjusting the networks using a posteriori probabilities, estimated as network output values.
Database: OpenAIRE
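
Note: The final sentence of the abstract mentions compensating for imbalanced training sets by treating network outputs as estimates of a posteriori probabilities. The record does not give the exact adjustment procedure, so the following is only a minimal sketch of one common prior-correction approach, assuming the outputs approximate class posteriors under the training-set class frequencies. The function name `adjust_for_priors`, its parameters, and the example numbers are hypothetical and do not come from the paper.

```python
def adjust_for_priors(outputs, train_counts, target_priors=None):
    """Rescale posterior-like outputs to remove training-set imbalance.

    outputs       : per-class network outputs for one pattern (assumed to
                    approximate posteriors under the training priors)
    train_counts  : number of training patterns per class
    target_priors : desired class priors; defaults to equal priors
                    (an assumption, not taken from the source)
    """
    total = sum(train_counts)
    train_priors = [c / total for c in train_counts]
    if target_priors is None:
        target_priors = [1.0 / len(train_counts)] * len(train_counts)

    # Divide out the training prior, multiply in the target prior.
    scaled = [o * t / p
              for o, t, p in zip(outputs, target_priors, train_priors)]

    # Renormalise so the adjusted values sum to 1.
    norm = sum(scaled)
    return [s / norm for s in scaled]


# Hypothetical two-species example: species 0 had 300 training patterns,
# species 1 only 100, and the raw outputs favour species 0.
raw_outputs = [0.6, 0.4]
adjusted = adjust_for_priors(raw_outputs, train_counts=[300, 100])
predicted = max(range(len(adjusted)), key=adjusted.__getitem__)
```

In this made-up case the raw outputs favour the over-represented species, while the adjusted values favour the under-represented one. Whether equal target priors are appropriate depends on the expected species frequencies in the samples to be identified.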