In the algorithms, we calculate the AUC with:


How do we get yTEST?

If we build yTEST, yTRAIN, yVAL, and YTRAINbig
as factors of the lead dependent variable,
yTEST will not contain any values.

Here is an example with three days:

Day | DV | DV_Lead
1 | 0 | 1
2 | 1 | 0
3 | 0 | NA

In this case,
yTRAIN = 1
yVAL = 0
yTEST = NA
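
For reference, the DV_Lead column in the table can be reproduced by shifting DV up by one day. This is only a minimal Python sketch, not the course code; the variable names dv and dv_lead are made up for illustration, and None stands in for NA.

```python
# DV for three consecutive days, as in the example table.
dv = [0, 1, 0]

# DV_Lead on day t is DV on day t+1.
# The last day has no following day, so its lead value is missing (None here).
dv_lead = dv[1:] + [None]

# Print the table: Day | DV | DV_Lead
for day, (d, lead) in enumerate(zip(dv, dv_lead), start=1):
    print(day, "|", d, "|", lead)
```

The last row always ends up without a lead value, whatever the number of days.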

Answers and follow-up questions

Answer or follow-up question 1

Dear student,

That last row is the one you would use to make a prediction and submit to the website.
It is normal that DV_Lead for that line is NA, because you do not know what happens the next day.

The line(s) immediately before the last line could be used as yTEST.
The lines before yTEST lines are yVAL lines.
The lines before yVAL lines are yTRAIN lines.
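
The ordering described above could be sketched as follows. This is an illustration in Python under stated assumptions, not the course implementation: the function name chronological_split, the parameters n_test and n_val, and the seven-day data are all hypothetical, and the only missing lead value is assumed to be in the last row.

```python
def chronological_split(dv_lead, n_test=1, n_val=1):
    """Split lead labels by time: yTRAIN first, then yVAL, then yTEST.

    Assumes the last element of dv_lead is the unlabeled row (None)
    that is kept aside for the final prediction.
    """
    labeled = dv_lead[:-1]  # drop the last row, whose lead value is unknown
    if len(labeled) < n_test + n_val + 1:
        raise ValueError("not enough labeled rows for train, val and test")

    test_start = len(labeled) - n_test  # yTEST: rows just before the last row
    val_start = test_start - n_val      # yVAL: rows just before yTEST
    return {
        "yTRAIN": labeled[:val_start],
        "yVAL": labeled[val_start:test_start],
        "yTEST": labeled[test_start:],
    }


# Made-up lead labels for seven days; the last one is unknown.
dv_lead = [1, 0, 0, 1, 1, 0, None]
split = chronological_split(dv_lead)
print(split)
```

With only three days, as in the question, just two rows have a known lead value, so there are not enough rows to fill yTRAIN, yVAL and yTEST at the same time.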

Michel Ballings

