Question



Max AUC (bagged trees)

Because we are sampling, each time we run the bagged trees code for a specific ensemble size, our AUC changes. How then do we choose an
ensemble size that gives the highest AUC?





Answers and follow-up questions





Answer or follow-up question 1

Dear student,

The way you want to approach this is as follows:

Estimation phase: Build one large ensemble with many trees, and do this only once.

Deployment phase: Deploy the ensemble repeatedly for k = 1, ..., K, where k is the ensemble size and K is the total number of trees, and
store the AUC of each deployment. For each k, use only the first k trees, so you start with the first tree you built and end with all of
them. The k at which the AUC peaks is the optimal ensemble size.

If you redo the estimation phase, the optimal k will likely differ, because resampling produces a different model.
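
As a rough illustration only (this is not the course's own code), the procedure could be sketched in Python as follows. It assumes the estimation phase has already been run once and that each tree's predicted scores on a validation set were stored; the names auc, auc_by_ensemble_size and best_ensemble_size, and the toy data, are hypothetical.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_ensemble_size(tree_scores, labels):
    """Deploy the first k trees for k = 1..K and return the AUC for each k.

    tree_scores[t][i] is the score of tree t for validation instance i.
    Nothing is refitted here; the trees come from a single estimation run.
    """
    running_sum = [0.0] * len(labels)
    aucs = []
    for k, scores in enumerate(tree_scores, start=1):
        running_sum = [a + b for a, b in zip(running_sum, scores)]
        ensemble_scores = [s / k for s in running_sum]  # average of first k trees
        aucs.append(auc(labels, ensemble_scores))
    return aucs


def best_ensemble_size(aucs):
    """Smallest k at which the AUC reaches its maximum."""
    return aucs.index(max(aucs)) + 1


# Toy data (assumed): 3 trees scored on 4 validation instances.
labels = [1, 0, 1, 0]
tree_scores = [
    [0.9, 0.8, 0.4, 0.1],  # tree 1
    [0.6, 0.2, 0.7, 0.3],  # tree 2
    [0.3, 0.9, 0.2, 0.6],  # tree 3
]
aucs = auc_by_ensemble_size(tree_scores, labels)
best_k = best_ensemble_size(aucs)
print(aucs, best_k)
```

Because the trees are fixed, rerunning this loop gives the same AUC curve every time; only a new estimation run changes it.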

Michel Ballings




Answer or follow-up question 2

What is meant by "Estimation phase: Create a big ensemble with many trees only once"?

Does anyone know what is meant by "Estimation phase: Create a big ensemble with many trees only once."?


Answer or follow-up question 3

Dear student,

Creating, fitting, building or estimating a model means making a model. It is referred to as the estimation phase and the output is a
model.

Deploying, or predicting with, a model means using the model to make predictions. This is referred to as the deployment phase and the
output is a prediction (for each instance).
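
To make the distinction concrete, here is a minimal Python sketch using a one-feature threshold rule as a stand-in for a real model. The functions estimate and deploy and the toy data are hypothetical, chosen only to show what each phase takes in and gives back.

```python
def estimate(xs, ys):
    """Estimation phase: learn a threshold rule from labelled data.

    Input: training features and labels. Output: a model (here, a dict).
    """
    best = None
    for t in sorted(set(xs)):
        # Count training instances classified correctly by "predict 1 if x >= t".
        correct = sum((x >= t) == (y == 1) for x, y in zip(xs, ys))
        if best is None or correct > best[1]:
            best = (t, correct)
    return {"threshold": best[0]}


def deploy(model, xs):
    """Deployment phase: apply a fitted model. Output: one prediction per instance."""
    return [1 if x >= model["threshold"] else 0 for x in xs]


model = estimate([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1])  # done once
predictions = deploy(model, [0.5, 3.5])               # can be repeated on new data
print(model, predictions)
```

The same fitted model can be deployed as many times as needed without running estimate again.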

Michel Ballings




