Greg Blevins
2004-Jun-08  02:28 UTC
[R] Recoding a multiple response question into a series of 1, 0 variables
Hello R folks.
1) The question that generated the data, which I call Qx:
Which of the following 5 items have you performed in the past month?  (multiple
response)
2) How the data is coded in my current dataframe:
The first item that a person selected is coded under a field called Qxfirst; the
second selected under Qxsecond, etc.  For the first Person, the NAs mean that
that person only selected two of the five items.
Hypothetical data is shown
           Qxfirst  Qxsecond  Qxthird  Qxfourth  Qxfifth
Person1          4         2       NA        NA       NA
Person2          1         3        4         5       NA
Person3          3         2       NA        NA       NA
3) How I want the data to be coded:
I want each field to be one of the five items and I want each field to contain a
1 or 0 code--1 if they mentioned the item, 0 otherwise.
Given the above data, the new fields would look as follows:
           Item1  Item2  Item3  Item4  Item5
Person1        0      1      0      1      0
Person2        1      0      1      1      1
Person3        0      1      1      0      0
I know how to do this using brute force (by writing a bunch of ifelse statements),
but given I have quite a lot of data in the above format, I was hoping a
function would streamline this--I have tried to create a function for this, but
my efforts to date have turned up junk.
Thanks
Greg Blevins
The Market Solutions Group 
Jonathan Baron
2004-Jun-08  02:45 UTC
[R] Recoding a multiple response question into a series of 1, 0 variables
On 06/07/04 21:28, Greg Blevins wrote:
>Hello R folks.
>
>1) The question that generated the data, which I call Qx:
>Which of the following 5 items have you performed in the past month? (multiple
>response)
>
>2) How the data is coded in my current dataframe:
>The first item that a person selected is coded under a field called Qxfirst; the
>second selected under Qxsecond, etc. For the first Person, the NAs mean that that
>person only selected two of the five items.
>
>Hypothetical data is shown
>
>           Qxfirst  Qxsecond  Qxthird  Qxfourth  Qxfifth
>Person1          4         2       NA        NA       NA
>Person2          1         3        4         5       NA
>Person3          3         2       NA        NA       NA
>
>3) How I want the data to be coded:
>
>I want each field to be one of the five items and I want each field to contain a 1 or
>0 code--1 if they mentioned the item, 0 otherwise.
>
>Given the above data, the new fields would look as follows:
>
>           Item1  Item2  Item3  Item4  Item5
>Person1        0      1      0      1      0
>Person2        1      0      1      1      1
>Person3        0      1      1      0      0

Here is an idea:

X <- c(4,5,NA,NA,NA) # one row
Y <- rep(NA,5) # an empty row
Y[X] <- 1

Y is now
NA NA NA 1 1
which is what you want. So you need to do this on each row and
then convert the NAs to 0s.

So first create an empty data frame, the same size as your
original one X, like my Y. Call it Y. Then a loop? (I can't
think of a better way just now, like with mapply.)

for (i in [whatever]) Y[i][X[i]] <- 1

(Not tested.)

Jon
-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
R search page: http://finzi.psych.upenn.edu/
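
A minimal runnable sketch of the row-by-row idea above, under a few assumptions that are not in the thread: the data are rebuilt by hand from the hypothetical table, the work is done on matrices so that `Y[i, idx]` addresses row i directly, and Y starts at 0 rather than NA so no NA-to-0 pass is needed. The names `n_items`, `Xm` and `picked` are illustrative only.

```r
# Hypothetical data from the original post, rebuilt as a data frame.
X <- data.frame(
  Qxfirst  = c(4, 1, 3),
  Qxsecond = c(2, 3, 2),
  Qxthird  = c(NA, 4, NA),
  Qxfourth = c(NA, 5, NA),
  Qxfifth  = rep(NA_real_, 3),
  row.names = c("Person1", "Person2", "Person3")
)

n_items <- 5  # assumption: the question offers exactly five items

# Convert to a numeric matrix so that rows can be indexed as Xm[i, ].
Xm <- as.matrix(X)

# Indicator matrix, one column per item, filled with 0 to begin with.
Y <- matrix(0, nrow = nrow(Xm), ncol = n_items,
            dimnames = list(rownames(Xm),
                            paste0("Item", seq_len(n_items))))

for (i in seq_len(nrow(Xm))) {
  picked <- Xm[i, ]
  picked <- picked[!is.na(picked)]  # drop the unused response slots
  Y[i, picked] <- 1                 # mark each selected item for person i
}

Y <- as.data.frame(Y)
print(Y)
```

With the three hypothetical respondents this should give the Item1 to Item5 table from the original question. Dropping the NAs before indexing is a design choice here: it keeps the assignment valid whether Y is a matrix or is later changed to a data frame.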