Background: A general, assumption-free theory of the performance of arbitrary artificial learners has not yet been developed. Empirically, little research has been performed on the question of how an artificial learner's performance is best described.
Objective: The objective of this paper is to determine which mathematical function best fits the learning curves produced by a neural network classification algorithm.
Methods: A Weka-based multilayer perceptron (MLP) neural network classification algorithm was applied to 109 datasets from publicly available repositories (UCI) in a stepwise k-fold cross-validation procedure, and the error rate was measured at each step. First, four different functions, i.e. power, linear, logarithmic, and exponential, were fitted to the measured error rates. Where the fits were statistically significant (n=69 datasets), we measured the mean squared error of each function's fit and ranked the functions accordingly. A dependent-samples t-test was performed to test whether the mean squared errors of the functions differ significantly from one another, and Wilcoxon's signed-rank test was used to test whether the differences between the functions' ranks are significant.
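The paper's experiments were run in Weka; purely as an illustration of the fitting-and-ranking step, the Python sketch below fits the four candidate models to a hypothetical learning curve with scipy.optimize.curve_fit and ranks them by mean squared error. The exact functional forms, starting parameters, and data are assumptions, not the paper's.

```python
# Sketch: fit four candidate models to a measured learning curve and
# rank them by the mean squared error of the fit. The functional forms
# are common conventions and are assumed, not taken from the paper.
import numpy as np
from scipy.optimize import curve_fit

def power(n, a, b):        # error ~ a * n^(-b)
    return a * np.power(n, -b)

def linear(n, a, b):       # error ~ a - b * n
    return a - b * n

def logarithmic(n, a, b):  # error ~ a - b * log(n)
    return a - b * np.log(n)

def exponential(n, a, b):  # error ~ a * exp(-b * n)
    return a * np.exp(-b * n)

def rank_models(train_sizes, error_rates):
    """Fit each candidate and return (name, mse) pairs, best fit first."""
    results = []
    candidates = [("power", power), ("linear", linear),
                  ("logarithmic", logarithmic), ("exponential", exponential)]
    for name, f in candidates:
        try:
            params, _ = curve_fit(f, train_sizes, error_rates,
                                  p0=(error_rates[0], 0.01), maxfev=10000)
            mse = np.mean((error_rates - f(train_sizes, *params)) ** 2)
            results.append((name, mse))
        except RuntimeError:
            pass  # the fit did not converge; skip this model
    return sorted(results, key=lambda r: r[1])

# Hypothetical learning curve: error rate at growing training-set sizes.
sizes = np.array([100, 200, 300, 400, 500, 600, 700, 800], dtype=float)
errors = np.array([0.40, 0.28, 0.21, 0.17, 0.15, 0.14, 0.135, 0.13])

for name, mse in rank_models(sizes, errors):
    print(f"{name:12s} MSE = {mse:.6f}")
```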
Results: The error rates induced by the neural network were best modeled by an exponential function. Across the 69 datasets, the exponential function was the best descriptor of the error rate in 60 cases, the power function in 8, the logarithmic function in 1, and the linear function in none. The average mean squared error across all datasets was 0.000365 for the exponential function, significantly different from the power (P=0.002), linear (P<0.001), and logarithmic (P=0.001) functions. By Wilcoxon's test, the exponential function's rank differs from the rank of every other model at any reasonable threshold (P<0.001).
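For readers unfamiliar with the two tests, the following sketch applies scipy's paired t-test and Wilcoxon signed-rank test to hypothetical per-dataset mean squared errors of two competing models; the numbers are illustrative only, and note that the paper applies the Wilcoxon test to per-dataset ranks rather than to the raw errors.

```python
# Sketch: compare two candidate models across datasets with the paired
# t-test and Wilcoxon signed-rank test. All values below are hypothetical.
import numpy as np
from scipy.stats import ttest_rel, wilcoxon

mse_exponential = np.array([3.2e-4, 4.1e-4, 2.8e-4, 5.0e-4, 3.6e-4])
mse_power       = np.array([6.5e-4, 5.9e-4, 4.4e-4, 7.2e-4, 6.1e-4])

t_stat, t_p = ttest_rel(mse_exponential, mse_power)   # paired t-test
w_stat, w_p = wilcoxon(mse_exponential, mse_power)    # signed-rank test
print(f"paired t-test: t = {t_stat:.2f}, P = {t_p:.4f}")
print(f"Wilcoxon:      W = {w_stat:.1f}, P = {w_p:.4f}")
```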
Conclusion: In the area of human cognitive performance, the exponential function was found to be the best description of an individual learner. For artificial learners, specifically the multilayer perceptron, our findings are consistent with this result. Our work can be used to model and forecast the future performance of an MLP neural network when not all available data have been used, or when more data must be obtained to reach a desired accuracy.
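As a sketch of the forecasting use suggested above, one could extrapolate a fitted exponential curve to a larger training-set size; the data and the three-parameter form with an asymptotic error floor c are assumptions for illustration, not the paper's model.

```python
# Sketch: extrapolate a fitted exponential learning curve to forecast
# the error rate at a larger training-set size. The three-parameter
# form with asymptote c is one common convention, assumed here.
import numpy as np
from scipy.optimize import curve_fit

def exponential(n, a, b, c):
    return a * np.exp(-b * n) + c  # c is the irreducible error floor

sizes = np.array([100, 200, 300, 400, 500], dtype=float)   # hypothetical
errors = np.array([0.40, 0.27, 0.20, 0.16, 0.14])          # hypothetical

params, _ = curve_fit(exponential, sizes, errors, p0=(0.5, 0.005, 0.1))
a, b, c = params

# Forecast: expected error rate if the training set grew to 2000 samples.
print(f"forecast at n=2000: {exponential(2000.0, a, b, c):.3f}")
print(f"estimated floor c:  {c:.3f}")
```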