Creating identical percentages of 1s and 0s efficiently

How can we make sure that the percentage of 1s (and also 0s) in the training and test sets is identical to the percentage in the starting data frame?

Answers and follow-up questions

Answer or follow-up question 1

Dear student,

Instead of calling sample() once on the entire data set, call it twice: once on the rows where the response is 0 and once on the rows where it is 1, using the same sampling fraction both times, and then combine the results. This is called stratified sampling.
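
A minimal sketch of this idea in R follows. The data frame `df`, the binary column `y`, and the 50/50 split in `train_frac` are assumptions made up for illustration, not part of the original question.

```r
set.seed(1)  # make the random split reproducible

# Hypothetical data: 100 rows, of which 30 have y == 1
df <- data.frame(x = rnorm(100),
                 y = rep(c(0, 1), times = c(70, 30)))

train_frac <- 0.5  # assumed share of rows that go to the training set

# Row indices of each class
idx0 <- which(df$y == 0)
idx1 <- which(df$y == 1)

# Call sample() once per class, with the same fraction for both.
# Note: sample(v, n) treats a length-one numeric v as 1:v, so this
# relies on each class having more than one row.
train0 <- sample(idx0, round(train_frac * length(idx0)))
train1 <- sample(idx1, round(train_frac * length(idx1)))

# Combine the two samples; the remaining rows form the test set
train_idx <- c(train0, train1)
train <- df[train_idx, ]
test  <- df[-train_idx, ]

# Compare the share of 1s in the full data, training set, and test set
c(full = mean(df$y), train = mean(train$y), test = mean(test$y))
```

The shares match exactly only when `train_frac` times each class count is a whole number, as it is here; otherwise rounding makes them approximately equal.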

Michel Ballings
