predKNN Error: "Error in matrix(data = predKNN, ncol = k, nrow = nrow(testKNN)) : non-numeric matrix extent"


I am trying to apply KNN to the NYSE data. I followed the code in the textbook up to the line just before the AUC is computed, and the following line fails:

predKNN <- rowMeans(data.frame(matrix(data=predKNN,ncol=k,nrow=nrow(testKNN))))

Error Message:
Error in matrix(data = predKNN, ncol = k, nrow = nrow(testKNN)) : non-numeric matrix extent

Structure of predKNN and testKNN:
> str(predKNN)
int [1:6963420] 1 0 1 1 0 1 0 1 0 1 ...
> str(testKNN)
num [1:696342] -0.033 -0.0351 -0.0335 -0.035 -0.033 ...


Answers and follow-up questions

Answer or follow-up question 1

Dear student,

The problem is that testKNN is a vector.
nrow(testKNN) will return NULL.
Change it to length(testKNN) and the error will go away.
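
A minimal sketch of the difference, using toy stand-ins rather than the real NYSE objects (the values and k = 10 are assumptions for illustration only):

```r
# Toy stand-ins for the objects in the question (hypothetical values).
k <- 10
testKNN <- c(-0.033, -0.0351, -0.0335)     # plain numeric vector, as in str(testKNN)
predKNN <- rep(c(1L, 0L, 1L), times = k)   # length(testKNN) * k predictions

nrow(testKNN)     # NULL, because a vector has no dim attribute
length(testKNN)   # 3

# Passing nrow = NULL is what makes matrix() fail; capture the message
# instead of stopping the script.
res <- tryCatch(
  matrix(data = predKNN, ncol = k, nrow = nrow(testKNN)),
  error = function(e) conditionMessage(e)
)

# With length() the reshape works and rowMeans() averages across the k columns.
predMat <- matrix(data = predKNN, ncol = k, nrow = length(testKNN))
predAvg <- rowMeans(predMat)
```

Note that this only removes the error. Whether length(testKNN) is the right number of rows depends on testKNN really holding one value per test observation.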

Michel Ballings

Answer or follow-up question 2


I've changed nrow to length, and now the AUC call gives an error:

> predKNN <- rowMeans(data.frame(matrix(data=predKNN,
+ ncol=k,
+ nrow=length(testKNN))))
> auc(roc(predKNN,yTEST))

Error in roc(predKNN, yTEST) :
Not enough distinct predictions to compute area under the ROC curve.

> str(predKNN)
num [1:696342] 0.7 0.8 0.4 0.6 0.6 0.5 0.6 0.4 0.6 0.6 ...
> str(yTEST)
num [1:116057] 1.007 1 0.99 0.991 1.087 ...

Do you have any suggestions?

Answer or follow-up question 3

This may be the reason. Compare the sizes of the two objects:

> nrow(predKNN)
> nrow(yTEST)

Answer or follow-up question 4

> length(predKNN)
[1] 696342
> length(yTEST)
[1] 116057

Answer or follow-up question 5

Dear student,

The roc function will complain in the following cases:
1) yTEST is not a factor
2) predKNN and yTEST are not of the same length
3) either of the two has NAs

From what I can tell, you have problems with (1) and (2).
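
The three conditions above can be checked before calling roc. This is only a sketch in base R: check_roc_inputs is a hypothetical helper, not part of any package, and the toy vectors merely mimic the shapes shown in the thread.

```r
# Hypothetical helper: returns a character vector describing which of the
# three conditions are violated (empty if none are).
check_roc_inputs <- function(predictions, labels) {
  problems <- character(0)
  if (!is.factor(labels)) {
    problems <- c(problems, "labels is not a factor")
  }
  if (length(predictions) != length(labels)) {
    problems <- c(problems, "predictions and labels differ in length")
  }
  if (anyNA(predictions) || anyNA(labels)) {
    problems <- c(problems, "NAs present")
  }
  problems
}

# Toy inputs: numeric labels of a different length, as in the question.
predKNN <- c(0.7, 0.8, 0.4, 0.6, 0.6, 0.5)
yTEST   <- c(1.007, 1, 0.99)
bad <- check_roc_inputs(predKNN, yTEST)   # reports conditions (1) and (2)

# Toy inputs that satisfy all three conditions.
yOK  <- factor(c(1, 0, 1, 0, 1, 0))
good <- check_roc_inputs(predKNN, yOK)    # character(0)
```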

Michel Ballings

Answer or follow-up question 6

I understand how to make yTEST a factor. But do you have any suggestions to make predKNN and yTEST the same length?

predKNN, testKNN, trainKNN, valKNN all have length 696342
yVAL, yTRAIN, yTEST, valind, trainind all have length 116057

Answer or follow-up question 7

Dear student,

I cannot help you without seeing your code.

predKNN, testKNN, yTEST should be of the same length.
trainKNN, yTRAIN, trainind should be of the same length.
valKNN, yVAL, valind should also be of the same length.
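
One thing worth checking, offered as a guess since the code is not shown: 696342 is exactly 6 times 116057, which would be consistent with a 6-column predictor table having been flattened into a vector somewhere. The sketch below uses hypothetical toy data (4 observations, 6 predictors) to show how that changes the length.

```r
# Hypothetical toy data: n observations with p predictors each.
n <- 4
p <- 6
testKNN <- as.data.frame(matrix(seq_len(n * p), nrow = n, ncol = p))
yTEST   <- factor(c(0, 1, 1, 0))

# As a data frame, the predictors line up with the response row by row.
nrow(testKNN) == length(yTEST)   # TRUE

# Flattening the table yields n * p values and drops the dimensions.
flat <- unlist(testKNN, use.names = FALSE)
length(flat)         # 24
is.null(dim(flat))   # TRUE

# The lengths reported in the thread have the same ratio.
696342 / 116057      # 6
```

If that is what happened, keeping the predictors as a data frame or matrix with one row per observation would make nrow(testKNN) agree with length(yTEST).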

Michel Ballings
