quote: The critical issue in developing a neural network is generalization: how well will the network make predictions for cases that are not in the training set? NNs, like other flexible nonlinear estimation methods such as kernel regression and smoothing splines, can suffer from either underfitting or overfitting. A network that is not sufficiently complex can fail to detect fully the signal in a complicated data set, leading to underfitting. A network that is too complex may fit the noise, not just the signal, leading to overfitting. Overfitting is especially dangerous because it can easily lead to predictions that are far beyond the range of the training data with many of the common types of NNs. Overfitting can also produce wild predictions in multilayer perceptrons even with noise-free data.

For an elementary discussion of overfitting, see Smith (1996). For a more rigorous approach, see the article by Geman, Bienenstock, and Doursat (1992) on the bias/variance trade-off (it's not really a dilemma). We are talking about statistical bias here: the difference between the average value of an estimator and the correct value. Underfitting produces excessive bias in the outputs, whereas overfitting produces excessive variance. There are graphical examples of overfitting and underfitting in Sarle (1995, 1999).

The best way to avoid overfitting is to use lots of training data. If you have at least 30 times as many training cases as there are weights in the network, you are unlikely to suffer from much overfitting, although you may get some slight overfitting no matter how large the training set is. For noise-free data, 5 times as many training cases as weights may be sufficient. But you can't arbitrarily reduce the number of weights for fear of underfitting.
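
To make the 30x and 5x rules of thumb from the quote concrete, here is a minimal Python sketch that counts the weights in a fully connected feedforward network and multiplies by the quoted ratio. The function names (`count_weights`, `min_training_cases`) and the example layer sizes are made up for illustration, and counting bias terms as weights is my assumption; the quoted text does not say how to count them. Treat the result as a rough guide, not a guarantee against overfitting.

```python
def count_weights(layer_sizes, bias=True):
    """Count trainable weights in a fully connected feedforward net.

    layer_sizes lists the number of units per layer, inputs first,
    e.g. [10, 5, 1]. Bias terms are counted as weights when bias=True
    (an assumption, not something the quoted text specifies).
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # connection weights between layers
        if bias:
            total += n_out         # one bias per unit in the next layer
    return total


def min_training_cases(layer_sizes, noise_free=False):
    """Apply the quoted rule of thumb: 30x weights, or 5x for noise-free data."""
    ratio = 5 if noise_free else 30
    return ratio * count_weights(layer_sizes)


if __name__ == "__main__":
    sizes = [10, 5, 1]  # hypothetical net: 10 inputs, 5 hidden units, 1 output
    print(count_weights(sizes))                        # 61
    print(min_training_cases(sizes))                   # 1830
    print(min_training_cases(sizes, noise_free=True))  # 305
```

So even a small net with 10 inputs and 5 hidden units would want roughly 1,800 noisy training cases under this rule, which shows how quickly the requirement grows as hidden units are added.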