Question



Not enough memory to run RF on the Stocks data

Hello, I can run the RF script all the way until the very last RF involving the big Base table.

When ntrees is above roughly 450, my machine quits and reports that there isn't enough memory. However, when I set ntrees to 400, it is able to run with only this
warning:

Warning messages:
1: In matrix(rfout$nodepred, ncol = ntree) :
Reached total allocation of 8095Mb: see help(memory.size)
2: In matrix(rfout$nodepred, ncol = ntree) :
Reached total allocation of 8095Mb: see help(memory.size)

Are there any workarounds? Would altering the node size work?





Answers and follow-up questions





Answer or follow-up question 1

Dear student,

Use either fewer trees or less data to reduce memory use.
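
As a rough illustration of the "fewer trees, less data" advice, here is a minimal sketch. The data frame `train`, the 50% sampling fraction, and the values for ntree and nodesize are placeholders chosen for the example, not values taken from the course script. The nodesize argument is included because the question asks about it: according to the randomForest documentation, a larger nodesize causes smaller trees to be grown. Whether that is enough to stay under the memory limit on the Stocks data is untested here.

```r
library(randomForest)

# Hypothetical stand-in for the real training data.
set.seed(42)
n <- 1000
train <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
train$y <- factor(as.integer(train$x1 + rnorm(n) > 0))

# Less data: fit on a random subset of the rows
# (the 50% fraction is an arbitrary choice for this sketch).
idx <- sample(nrow(train), size = floor(0.5 * nrow(train)))

# Fewer trees, and larger terminal nodes so that each tree is smaller.
rf_small <- randomForest(
  y ~ ., data = train[idx, ],
  ntree = 100,     # fewer trees than the ~450 that ran out of memory
  nodesize = 25    # the default for classification is 1
)
```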

Another (more involved) way is to build and use the trees one by one, each time storing the tree on your disk.
Because you are doing it sequentially, your memory footprint at each point in time will go down from ntree trees to 1 tree.

Michel Ballings


Answer or follow-up question 2

"Another (more involved) way is to build and use the trees one by one, each time storing the tree on your disk.
Because you are doing it sequentially, your memory footprint at each point in time will go down from ntree trees to 1 tree."

Do you have any suggestions on where I could find information on how to do this?


Answer or follow-up question 3

Dear student,

I do not, but this should get you started:

Training:

for (a given number of trees)
-use the ntree parameter to build 1 (or more) tree(s)
-use save() to store it on disk
-use rm() to remove it from your working environment
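
The training steps above could be sketched roughly as follows. The helper name `train_trees_to_disk`, the file naming scheme, the temporary directory, and the toy data are all assumptions made for this example. Each file ends up holding one single-tree forest stored under the object name `tree`.

```r
library(randomForest)

# Hypothetical helper: grows n_trees single-tree forests one at a time
# and writes each one to its own .RData file in tree_dir.
train_trees_to_disk <- function(x, y, n_trees, tree_dir) {
  dir.create(tree_dir, showWarnings = FALSE, recursive = TRUE)
  for (i in seq_len(n_trees)) {
    tree <- randomForest(x = x, y = y, ntree = 1)   # build one tree
    save(tree, file = file.path(tree_dir, sprintf("tree_%04d.RData", i)))
    rm(tree)                                        # drop it from memory
  }
  invisible(tree_dir)
}

# Small demonstration on made-up data (replace with your own x and y).
set.seed(1)
x <- data.frame(a = rnorm(200), b = rnorm(200))
y <- factor(as.integer(x$a + rnorm(200) > 0))
tree_dir <- file.path(tempdir(), "rf_trees")
train_trees_to_disk(x, y, n_trees = 25, tree_dir = tree_dir)
```

Only one tree is held in memory at any time, at the cost of extra disk reads and writes.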


Deployment:

for (a given number of trees)
-use load() to load the tree in your environment
-use predict.randomForest() to make a prediction
-use rm() to remove the tree from your environment
-store each prediction (it will be a binary prediction)

At the end, take the proportion of 1s as your final prediction.
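
A minimal sketch of the deployment loop, under a few assumptions: every .RData file in `tree_dir` contains a single-tree forest saved under the name `tree`, the response is a factor, and the positive class is labelled "1". The helper name `predict_from_disk` and its arguments are made up for this example.

```r
library(randomForest)

# Hypothetical helper: loads the stored trees one by one, collects each
# tree's class prediction, and returns the proportion of votes for the
# positive class for every row of newdata.
predict_from_disk <- function(tree_dir, newdata, positive = "1") {
  files <- list.files(tree_dir, pattern = "\\.RData$", full.names = TRUE)
  stopifnot(length(files) > 0)
  votes <- numeric(nrow(newdata))
  for (f in files) {
    load(f)                                   # restores the object `tree`
    pred <- predict(tree, newdata = newdata)  # one class label per row
    votes <- votes + (as.character(pred) == positive)
    rm(tree)                                  # free the tree before the next one
  }
  votes / length(files)                       # proportion of 1s
}
```

Calling `predict_from_disk(tree_dir, newdata)` gives one proportion per row. Thresholding it at 0.5 would give a majority-vote class label, if a label rather than a score is needed.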

Michel Ballings


