[artificial-intelligence] What is the difference between train, validation and test set in neural networks?

Say you train a model on a training set and then measure its performance on a test set. You think there is still room for improvement, so you try tweaking the hyper-parameters (if the model is a neural network, hyper-parameters include the number of layers or the number of nodes per layer). Now you get slightly better performance. However, when the model is applied to other data (not in the training or test set), you may not get the same level of accuracy. This is because you introduced some bias while tweaking the hyper-parameters to get better accuracy on the test set: you have effectively adapted the model and its hyper-parameters to that particular test set.

A common solution is to split the training set further to create a validation set (a code sketch of the split follows the list). Now you have

  • training set
  • validation set
  • test set
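
As a concrete illustration, here is a minimal sketch of such a three-way split using scikit-learn's `train_test_split`. The synthetic dataset and the 60/20/20 split ratios are assumptions made just for the example:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic data stands in for a real dataset (an assumption for this sketch).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# First carve out the test set (20% of the data), then split the remainder
# into a reduced training set (60% of the total) and a validation set (20%).
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=0)  # 0.25 * 0.8 = 0.2
```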

You proceed as before but this time you use the validation set to test the performance and tweak the hyper-parameters. More specifically, you train multiple models with various hyper-parameters on the reduced training set (i.e., the full training set minus the validation set), and you select the model that performs best on the validation set.
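
Continuing the sketch above, the selection loop might look like the following, using scikit-learn's `MLPClassifier` with the hidden-layer sizes as the hyper-parameter being tuned (the candidate configurations are arbitrary choices for the example):

```python
from sklearn.neural_network import MLPClassifier

best_config, best_score = None, -1.0
for hidden_layers in [(32,), (64,), (64, 32)]:  # candidate hyper-parameters
    model = MLPClassifier(hidden_layer_sizes=hidden_layers,
                          max_iter=500, random_state=0)
    model.fit(X_train, y_train)        # fit on the reduced training set only
    score = model.score(X_val, y_val)  # evaluate on the validation set
    if score > best_score:
        best_config, best_score = hidden_layers, score
```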

Once you've selected the model that performs best on the validation set, you retrain it with those hyper-parameters on the full training set (including the validation set), and this gives you the final model.
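
In code, that retraining step could look like this (continuing with the same names as in the sketches above):

```python
# Retrain the winning configuration on the full training set,
# i.e., the reduced training set plus the validation set.
final_model = MLPClassifier(hidden_layer_sizes=best_config,
                            max_iter=500, random_state=0)
final_model.fit(X_trainval, y_trainval)
```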

Lastly, you evaluate this final model on the test set to get an estimate of the generalization error.
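
And the final evaluation, touching the test set exactly once:

```python
# The test set was never used for training or tuning, so this score is an
# (approximately) unbiased estimate of generalization performance.
print("Estimated generalization accuracy:", final_model.score(X_test, y_test))
```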