Question



Defining validation data vs. training data

I looked at previous questions on defining validation data and training data. Your response described training data as the very far past
and validation data as the far past. Is there a distinct definition of how to distinguish between the very far past and the far past? Do we take
our data and divide it in two based on time? Am I approaching this correctly?





Answers and follow-up questions





Answer or follow-up question 1

Dear student,

There are no strict rules. The best approach is to take the last but one day in your data as test data.
All the days before that are split by time into training data (very far past) and validation data (far past):
the earliest days go to training and the later days go to validation. You could use 50% of the days for training
and 50% for validation, or go with a 70%/30% split. I prefer a 50/50 split, as tuning (validation) is just as crucial as training.

Michel Ballings
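
As an illustration of the split described in the answer, here is a minimal Python sketch. It is not taken from the course material: the function name chronological_split, the train_frac parameter, and the example dates are all hypothetical. It assumes the data can be grouped by distinct calendar days. The answer does not say what happens to the final day, so this sketch simply leaves it unassigned.

```python
from datetime import date, timedelta


def chronological_split(days, train_frac=0.5):
    """Split distinct days into (training days, validation days, test day) by time.

    Hypothetical helper: train_frac is the share of the pre-test days
    that goes to training (0.5 for a 50/50 split, 0.7 for 70/30).
    """
    ordered = sorted(set(days))
    if len(ordered) < 4:
        raise ValueError("need at least 4 distinct days")

    test_day = ordered[-2]   # last-but-one day is the test day
    earlier = ordered[:-2]   # every day before the test day

    # Earliest days -> training (very far past), later days -> validation (far past).
    n_train = int(round(len(earlier) * train_frac))
    n_train = min(max(n_train, 1), len(earlier) - 1)  # keep both parts non-empty
    return earlier[:n_train], earlier[n_train:], test_day


# Made-up example: 12 consecutive days of data.
start = date(2024, 1, 1)
days = [start + timedelta(days=i) for i in range(12)]
train, valid, test_day = chronological_split(days, train_frac=0.5)
```

With these 12 example days, the test day is the 11th day, and the 10 days before it are divided into 5 training days followed by 5 validation days. Passing train_frac=0.7 gives 7 training days and 3 validation days instead.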


