[machine-learning] Is there a rule of thumb for how to divide a dataset into training and validation sets?