Question



Text mining after SVD

What are the next steps after we run an SVD on our text data to predict stock data?

The code says I can predict using this line:
predstories <- reviews_mat %*% s$v %*% solve(diag(s$d))

I also know I only need the first 10 columns of the SVD to explain most of the words. Do I just use the first 10 columns of s$v?

The result of predstories is a 160 x 160 matrix. How do I use this to predict my stocks?





Answers and follow-up questions





Answer or follow-up question 1

Dear student,

The singular vectors (i.e., concepts) are additional predictors. Just merge them with your stock data.
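
To make the dimensions concrete, here is a minimal sketch in base R. It assumes a small made-up document-term matrix: the names reviews_mat and s follow the question, while k and the random data are purely illustrative. Keeping only the first k columns of s$v, together with the matching k singular values, gives one row per document and k concept columns instead of a 160 x 160 matrix.

```r
set.seed(1)
# Hypothetical document-term matrix: 160 documents x 300 terms (random counts)
reviews_mat <- matrix(rpois(160 * 300, lambda = 1), nrow = 160, ncol = 300)

s <- svd(reviews_mat)
k <- 10  # number of concepts to keep (assumed, following the question)

# Project documents onto the first k right singular vectors and rescale
# by the matching k singular values
concepts <- reviews_mat %*% s$v[, 1:k] %*% solve(diag(s$d[1:k]))

dim(concepts)  # one row per document, k columns

# On the data the SVD was fitted on, this matches the first k left
# singular vectors up to numerical precision
all.equal(concepts, s$u[, 1:k])
```

The projection form is mainly useful for new documents that were not part of the SVD; for the original documents, s$u[, 1:k] gives the same concept scores directly.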

Just to be clear what all the steps are:
1) make a dtm on your stories
2) aggregate them by stock and by day
3) do svd and select a subset of singular vectors
4) merge them with your stock data by stock and day
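
The four steps above could be sketched roughly as follows in base R. Everything here is an assumption for illustration: the data frames stories and stock_data, the column names stock, day and return, the random counts, and the choice k = 3 are all hypothetical. In practice the dtm would come from your text-mining package rather than from rpois().

```r
set.seed(42)
# Hypothetical story-level data: 12 stories, each tagged with a stock and a day
stories <- data.frame(
  stock = rep(c("AAA", "BBB"), each = 6),
  day   = rep(rep(as.Date("2020-01-01") + 0:2, each = 2), times = 2)
)

# 1) Hypothetical document-term matrix: one row per story, 8 terms
dtm <- matrix(rpois(12 * 8, lambda = 2), nrow = 12,
              dimnames = list(NULL, paste0("term", 1:8)))

# 2) Aggregate term counts by stock and day (sum over stories)
agg <- aggregate(as.data.frame(dtm), by = stories[c("stock", "day")], FUN = sum)

# 3) SVD on the aggregated matrix; keep the first k left singular vectors
k <- 3  # assumed number of concepts
s <- svd(as.matrix(agg[, colnames(dtm)]))
concepts <- data.frame(agg[c("stock", "day")], s$u[, 1:k])
names(concepts)[-(1:2)] <- paste0("concept", 1:k)

# 4) Merge with (hypothetical) stock data by stock and day
stock_data <- data.frame(
  stock  = rep(c("AAA", "BBB"), each = 3),
  day    = rep(as.Date("2020-01-01") + 0:2, times = 2),
  return = rnorm(6)
)
model_data <- merge(stock_data, concepts, by = c("stock", "day"))
```

After the merge, model_data has one row per stock and day, with the concept columns sitting next to the stock variables as additional predictors for whatever model you fit.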

Michel Ballings


