Michael Green and Mattias Ohlsson
Comparison of standard resampling methods for performance estimation of artificial neural network ensembles
Proceedings of Computational Intelligence in Medicine and Healthcare (CIMED) (2007)

Abstract: Estimating generalization performance is an important task for classification in medical applications. In this study we focus on artificial neural network (ANN) ensembles as the machine learning technique. We present a numerical comparison of five common resampling techniques on five different data sets: k-fold cross validation (CV), holdout with three different cutoffs, and the bootstrap. The results show that CV and holdout with cutoffs of 0.25 and 0.50 are the best resampling strategies for estimating the true performance of ANN ensembles. The bootstrap, using the .632+ rule, is too optimistic, while holdout 0.75 underestimates the true performance.
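
As an illustration of the three families of resampling schemes named in the abstract, the following minimal Python sketch generates train/test index splits for each. It is not the authors' implementation. The function names are hypothetical, the holdout cutoff is assumed here to be the fraction of data held out for testing (the abstract does not define it), and the .632+ correction applied to the bootstrap estimate is not shown.

```python
import random


def kfold_splits(n, k, rng):
    """Yield (train, test) index lists for k-fold cross validation."""
    idx = list(range(n))
    rng.shuffle(idx)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def holdout_split(n, test_fraction, rng):
    """Return one (train, test) split.

    ASSUMPTION: test_fraction is the share of samples held out for testing.
    """
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def bootstrap_split(n, rng):
    """Return (train, test): train is drawn with replacement,
    test holds the samples that were never drawn (out-of-bag)."""
    train = [rng.randrange(n) for _ in range(n)]
    in_bag = set(train)
    test = [i for i in range(n) if i not in in_bag]
    return train, test


rng = random.Random(0)  # fixed seed so the splits are reproducible
n = 100                 # hypothetical data set size
cv = list(kfold_splits(n, 5, rng))
ho_train, ho_test = holdout_split(n, 0.25, rng)
bs_train, bs_test = bootstrap_split(n, rng)
```

In each case a model would be trained on the `train` indices and scored on the `test` indices. For CV the scores are combined over the k folds, and for holdout and the bootstrap the split would typically be repeated several times.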


LU TP 07-16