## Question

Should I subtract colMeans from my DTM matrix when I am recreating it in the predict function?

In your code for text mining and creating an SVD, you first "center the data" by subtracting the column means of `reviews_mat` before performing the SVD. I did the same in my best-algorithm creation function. In the prediction function, when I recreate the DTM for just that day's stories, do I still need to subtract the column means from the DTM matrix? Wouldn't I need to save the column means from the first matrix and subtract them from the new one before I use the `reviews_mat %*% s$v %*% solve(diag(s$d))` code to create my variables?

Thanks.

## Answers and follow-up questions

**Answer or follow-up question 1**

Dear student,

"Wouldn't I need to save the column means from the first matrix and subtract them from this before I use the `reviews_mat %*% s$v %*% solve(diag(s$d))` code to create my variables?"

Indeed.

Michel Ballings
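
For readers who want to see the idea spelled out, here is a minimal sketch in R. It is not the course code: the toy matrices, the names `train_mat`, `new_mat`, `train_means`, `V_k` and `D_k_inv`, and the choice to keep only `k = 2` dimensions are all assumptions made for illustration. It also assumes the new DTM has exactly the same terms, in the same column order, as the training DTM. The point is only that the column means are computed once on the training matrix, stored, and reused at prediction time.

```r
# Hypothetical training DTM: 6 documents x 4 terms (toy counts).
train_mat <- matrix(c(2, 0, 1, 3,
                      0, 1, 0, 2,
                      1, 1, 4, 0,
                      3, 0, 2, 1,
                      0, 2, 1, 1,
                      1, 3, 0, 2),
                    nrow = 6, byrow = TRUE,
                    dimnames = list(NULL, c("t1", "t2", "t3", "t4")))

# Training step: store the column means, center, then decompose.
train_means    <- colMeans(train_mat)
train_centered <- sweep(train_mat, 2, train_means, "-")
s <- svd(train_centered)

# Keep the first k dimensions (an assumption; the question uses all of them).
k       <- 2
V_k     <- s$v[, 1:k, drop = FALSE]
D_k_inv <- diag(1 / s$d[1:k], nrow = k)

# Prediction step: hypothetical new DTM with the same columns, same order.
new_mat <- matrix(c(1, 0, 2, 1,
                    0, 3, 1, 0),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(NULL, colnames(train_mat)))
stopifnot(identical(colnames(new_mat), colnames(train_mat)))

# Center with the TRAINING means, not with colMeans(new_mat).
new_centered <- sweep(new_mat, 2, train_means, "-")
new_scores   <- new_centered %*% V_k %*% D_k_inv
```

Centering the new matrix with its own column means would place the new documents in a differently shifted space than the one the SVD was fitted on, so the resulting variables would not be comparable to those used to train the model. In a predict function, `train_means`, `s$v` and `s$d` would typically be saved alongside the fitted model and passed in together.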
