Max AUC (bagged trees)
Because we are sampling, the AUC changes each time we run the bagged trees code for a given ensemble size. How, then, do we choose the
ensemble size that gives the highest AUC?
Answers and follow-up questions
Answer or follow-up question 1
The way to approach this is as follows:
Estimation phase: build one large ensemble with many trees, and do this only once.
Deployment phase: deploy the ensemble repeatedly for k = 1..all, where k is the ensemble size (the first k trees you built), and store the
AUC for each deployment. Start with only the first tree and end with all of them. The k at which the AUC peaks is the optimal ensemble size.
If you redo the estimation phase, the optimal k will of course differ, because the resampling produces a different model.
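The two phases above can be sketched as follows. This is only an illustration under stated assumptions, not the course code: the data is synthetic, each "tree" is a depth-1 stump to keep the example short, the AUC is measured on a separate validation set, and all names (fit_stump, auc_by_ensemble_size, and so on) are made up for this sketch.

```python
import random

def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_stump(X, y):
    """Depth-1 tree (assumption: a stand-in for a full tree). Returns x -> P(y = 1)."""
    best = None  # (squared error, feature index, threshold, p_left, p_right)
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            left = [yi for x, yi in zip(X, y) if x[j] <= t]
            right = [yi for x, yi in zip(X, y) if x[j] > t]
            if not left or not right:
                continue
            pl, pr = sum(left) / len(left), sum(right) / len(right)
            # For 0/1 labels, a leaf with mean p has squared error n * p * (1 - p).
            sse = len(left) * pl * (1 - pl) + len(right) * pr * (1 - pr)
            if best is None or sse < best[0]:
                best = (sse, j, t, pl, pr)
    if best is None:  # no valid split: predict the base rate
        p = sum(y) / len(y)
        return lambda x: p
    _, j, t, pl, pr = best
    return lambda x: pl if x[j] <= t else pr

def make_data(n, rng):
    """Synthetic two-feature binary problem (purely illustrative)."""
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    y = [1 if x[0] + x[1] + rng.gauss(0, 1) > 0 else 0 for x in X]
    return X, y

def auc_by_ensemble_size(trees, X, y):
    """Deployment phase: AUC of the first k trees, for k = 1..len(trees)."""
    running = [0.0] * len(X)  # running sum of per-tree predictions
    aucs = []
    for k, tree in enumerate(trees, start=1):
        running = [r + tree(x) for r, x in zip(running, X)]
        aucs.append(auc(y, [r / k for r in running]))
    return aucs

rng = random.Random(0)
X_train, y_train = make_data(100, rng)
X_valid, y_valid = make_data(150, rng)

# Estimation phase: build the big ensemble once, one bootstrap sample per tree.
trees = []
for _ in range(30):
    idx = [rng.randrange(len(X_train)) for _ in range(len(X_train))]
    trees.append(fit_stump([X_train[i] for i in idx], [y_train[i] for i in idx]))

# Deployment phase: reuse the same trees for every k and keep the best one.
aucs = auc_by_ensemble_size(trees, X_valid, y_valid)
best_k = aucs.index(max(aucs)) + 1
print("best k:", best_k, "AUC:", round(aucs[best_k - 1], 3))
```

Because the trees are built once and only the prefix length k varies, the AUC curve over k is fixed for that ensemble. Changing the seed corresponds to redoing the estimation phase, so best_k can change.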
Answer or follow-up question 2
What is meant by "Estimation phase: Create a big ensemble with many trees only once"?
Does anyone know what is meant by "Estimation phase: Create a big ensemble with many trees only once."?
Answer or follow-up question 3
Creating, fitting, building, or estimating a model all mean the same thing: making a model. This is referred to as the estimation phase, and
the output is a model.
Deploying, or predicting with, a model means using the model to make predictions. This is referred to as the deployment phase and the
output is a prediction (for each instance).