Leading variable

In an answer to a question titled "Predicting using Naive Bayes,"
you said "If you use data of Jan 2, the prediction will be for Jan 3 (provided that you have a leading variable in your basetable)."

Are leading variables required for predictions on the stock data?

In an answer to a question titled "Lead variable for predictions,"
you gave a hint on how to fix code to add a leading variable to the stock data.

You said:

I won't give you the solution, as this is part of the assignment, but I will give you a strong hint.

This part is incorrect:


What you are doing is:
Take all values of DV by Symbol. Per Symbol, apply the following function:

That function is adding a leading NA to a vector of length 1 containing only its last element (i.e., you select
the element at the last position of the vector).

This is clearly not what you want.

I still cannot figure out how to link your hint to a solution
and I have not been able to find anything in the book about generating a leading variable.
Given your response in the first question,
I think that generating an effective leading variable is critical for success on this project.

My best solution would be to create a Lead variable like this for each individual stock:

Symbol  Date      ...  DV   Lead
A       99/12/31  ...  NA   0
A       00/01/01  ...  0    1
A       00/01/02  ...  1    0 or 1 (based on DV of 00/01/03)
...     ...       ...  ...  ...
A       00/12/31  ...  1    0
A       01/01/01  ...  0    1
A       01/01/02  ...  1    NA
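
To make the table concrete, here is a minimal sketch of the rule I have in mind: within each Symbol, ordered by Date, the Lead on a row is the DV of the next row, and the last row of each Symbol gets NA. It is written in Python purely for illustration and is not the assignment's code. The function name add_lead, the ISO-style date strings (so that plain sorting gives date order), and the use of None for NA are all my own assumptions.

```python
from itertools import groupby
from operator import itemgetter


def add_lead(rows):
    """Return copies of rows with a 'Lead' key holding the next row's DV.

    rows: list of dicts with keys 'Symbol', 'Date' (ISO string), and 'DV'.
    The last row of each Symbol gets Lead = None (standing in for NA).
    """
    # Sort by Symbol, then Date, so "next row" means "next date, same stock".
    ordered = sorted(rows, key=itemgetter("Symbol", "Date"))
    result = []
    for _, group in groupby(ordered, key=itemgetter("Symbol")):
        group = list(group)
        # Shift DV up by one position within the group; pad the end with None.
        next_dvs = [r["DV"] for r in group[1:]] + [None]
        for row, lead in zip(group, next_dvs):
            result.append({**row, "Lead": lead})
    return result


# Hypothetical toy data, deliberately out of order; symbol B is made up.
rows = [
    {"Symbol": "A", "Date": "2000-01-02", "DV": 1},
    {"Symbol": "B", "Date": "2000-01-01", "DV": 1},
    {"Symbol": "A", "Date": "1999-12-31", "DV": None},
    {"Symbol": "A", "Date": "2000-01-01", "DV": 0},
    {"Symbol": "B", "Date": "2000-01-02", "DV": 0},
]

for row in add_lead(rows):
    print(row["Symbol"], row["Date"], row["DV"], row["Lead"])
```

The shift is done separately per Symbol so that the last date of one stock never picks up the first DV of the next stock.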

Is this what you are looking for?

Also, can you add something to the book about creating a leading variable?

Answers and follow-up questions

Answer or follow-up question 1

I figured it out.
