Question



Error: undefined columns selected, when running dummy(x)

Hi Dr. Ballings,

I'm having trouble figuring out exercise 2. After installing the dummy package, creating the data frame, and running 'dummy(x)', I get the
following error:

'Error in `[.data.frame`(x, , z) : undefined columns selected'

I have spoken to many people and they all say similar things.

Could you help me?

Best, Ian Safie





Answers and follow-up questions





Answer or follow-up question 1

Hi Ian,

That error is intentional and part of solving the problem. Whenever you get an error, the first thing to do is read the documentation.
In this case you would run ?dummy. Then read the description of the argument and make sure that your argument fulfills all requirements.

Michel Ballings
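
For readers following along, here is a minimal sketch of how one might compare an argument against the documentation. The data frame below is a hypothetical stand-in, since the thread does not show the exercise's actual x; only base R functions are used.

```r
# Hypothetical stand-in for the exercise's data frame (not the real x)
x <- data.frame(a = c(1, 2, 1), b = c(3, 3, 4))

# With the dummy package loaded, ?dummy opens the help page for the function.
# It is commented out here so the sketch also runs without the package.
# ?dummy

# Inspect the structure of x and the class of each column,
# then compare this with what the help page says about the argument
str(x)
classes <- sapply(x, class)
print(classes)
```

Comparing the printed classes with the argument description in the help page is usually enough to see whether the input meets the stated requirements.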


Answer or follow-up question 2

Dr. Ballings,

I think that part of the problem is that x needs to be a data frame that contains a character or a factor vector.

I've tried using as.factor and as.character to change the data frame into factors/characters, but that doesn't seem to work.

Am I on the right track with using the as.factor command? Is there a particular column or row that I need to change to factors/characters?

At the moment, I don't know what approach I should take next to solving this problem. Could you offer some advice please?

John


Answer or follow-up question 3

As I understand the documentation, dummy requires factors so the data can be sorted accordingly.

I haven't found a combination of sapply, as.factor, and x to create the data frame we should pass to dummy().

I tried to create a new column of factors to append to x, but I assume that's not the correct approach, and I couldn't make it work either.


Answer or follow-up question 4

The first line of the documentation (when you do ?dummy) says 'dummy creates dummy variables of all the factors and character vectors
in a data frame.'

This means that every variable we want to create dummy variables for must be a factor or a character vector.

In class we have seen that this can easily be accomplished using as.factor() or as.character(). If you have a lot of variables the best
way to do that is in combination with sapply().

Michel Ballings
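
One possible way to put the advice above into practice is sketched below. It is not the official exercise solution: the data frame is a hypothetical stand-in for x, and the lapply() variant is an alternative that the answer does not mention. Note that sapply() returns a matrix here, so the result is wrapped in data.frame() again.

```r
# Hypothetical stand-in for the exercise's data frame (not the real x)
x <- data.frame(a = c(1, 2, 1), b = c(3, 3, 4))

# Variant 1: sapply() with as.character(). The result is a character matrix,
# so it is converted back into a data frame with character columns.
x_chr <- data.frame(sapply(x, as.character), stringsAsFactors = FALSE)

# Variant 2 (an alternative, not from the answer above): lapply() with
# as.factor(). Assigning into x_fac[] keeps the data frame structure.
x_fac <- x
x_fac[] <- lapply(x, as.factor)

# Check that every column now has the class the documentation asks for
print(sapply(x_chr, class))
print(sapply(x_fac, class))

# Only call dummy() if the package is installed
if (requireNamespace("dummy", quietly = TRUE)) {
  print(dummy::dummy(x_fac))
}
```

Either converted data frame contains only character or factor columns, which is what the first line of the documentation describes.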


