Greg Blevins
2004-Jun-08 02:28 UTC
[R] Recoding a multiple response question into a series of 1, 0 variables
Hello R folks.

1) The question that generated the data, which I call Qx:
Which of the following 5 items have you performed in the past month? (multiple response)

2) How the data is coded in my current dataframe:
The first item that a person selected is coded under a field called Qxfirst; the second selected under Qxsecond, etc. For the first person, the NAs mean that that person selected only two of the five items.

Hypothetical data is shown:

         Qxfirst Qxsecond Qxthird Qxfourth Qxfifth
Person1        4        2      NA       NA      NA
Person2        1        3       4        5      NA
Person3        3        2      NA       NA      NA

3) How I want the data to be coded:
I want each field to be one of the five items and I want each field to contain a 1 or 0 code--1 if they mentioned the item, 0 otherwise.

Given the above data, the new fields would look as follows:

         Item1 Item2 Item3 Item4 Item5
Person1      0     1     0     1     0
Person2      1     0     1     1     1
Person3      0     1     1     0     0

I know how to do this using brute force (by writing a bunch of ifelse statements), but given that I have quite a lot of data in the above format, I was hoping a function would streamline this. I have tried to create a function for this, but my efforts to date have turned up junk.

Thanks,
Greg Blevins
The Market Solutions Group
Jonathan Baron
2004-Jun-08 02:45 UTC
[R] Recoding a multiple response question into a series of 1, 0 variables
On 06/07/04 21:28, Greg Blevins wrote:
> Hello R folks.
>
> 1) The question that generated the data, which I call Qx:
> Which of the following 5 items have you performed in the past month?
> (multiple response)
>
> 2) How the data is coded in my current dataframe:
> The first item that a person selected is coded under a field called
> Qxfirst; the second selected under Qxsecond, etc. For the first person,
> the NAs mean that that person selected only two of the five items.
>
> Hypothetical data is shown:
>
>          Qxfirst Qxsecond Qxthird Qxfourth Qxfifth
> Person1        4        2      NA       NA      NA
> Person2        1        3       4        5      NA
> Person3        3        2      NA       NA      NA
>
> 3) How I want the data to be coded:
>
> I want each field to be one of the five items and I want each field to
> contain a 1 or 0 code--1 if they mentioned the item, 0 otherwise.
>
> Given the above data, the new fields would look as follows:
>
>          Item1 Item2 Item3 Item4 Item5
> Person1      0     1     0     1     0
> Person2      1     0     1     1     1
> Person3      0     1     1     0     0

Here is an idea:

    X <- c(4,5,NA,NA,NA)  # one row
    Y <- rep(NA,5)        # an empty row
    Y[X] <- 1

Y is now

    NA NA NA 1 1

which is what you want. So you need to do this on each row and then convert the NAs to 0s.

So first create an empty data frame, the same size as your original one X, like my Y. Call it Y. Then a loop? (I can't think of a better way just now, like with mapply.) Index by row, and drop the NAs from the column index, since a data frame will not accept them in an assignment:

    for (i in [whatever]) Y[i, na.omit(unlist(X[i, ]))] <- 1

(Not tested.)

Jon
--
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
R search page: http://finzi.psych.upenn.edu/
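
The row-by-row idea above can be sketched as runnable code. This is only an illustration on the hypothetical data from the question, not a tested solution for the real survey file. The names Xm, n_items, picked and Y2 are made up for the example, and it assumes the number of items (5) is known in advance. It works on a matrix filled with 0s rather than an NA-filled data frame, so there is no separate step to convert NAs to 0s. A one-line alternative using apply() and %in% is included for comparison.

```r
# Hypothetical data from the original question
X <- data.frame(
  Qxfirst  = c(4, 1, 3),
  Qxsecond = c(2, 3, 2),
  Qxthird  = c(NA, 4, NA),
  Qxfourth = c(NA, 5, NA),
  Qxfifth  = c(NA_real_, NA_real_, NA_real_),
  row.names = c("Person1", "Person2", "Person3")
)

n_items <- 5          # assumed: number of answer options in Qx
Xm <- as.matrix(X)    # numeric matrix, one row per respondent

# Loop version: start from all 0s and set the mentioned items to 1
Y <- matrix(0, nrow = nrow(Xm), ncol = n_items,
            dimnames = list(rownames(Xm),
                            paste("Item", seq_len(n_items), sep = "")))
for (i in seq_len(nrow(Xm))) {
  picked <- Xm[i, !is.na(Xm[i, ])]  # item numbers this person selected
  Y[i, picked] <- 1
}

# Alternative without an explicit loop: for each row, test which of
# the item numbers 1..n_items appear among the responses
Y2 <- t(apply(Xm, 1, function(row) as.integer(seq_len(n_items) %in% row)))
colnames(Y2) <- colnames(Y)

print(Y)
```

On this data both Y and Y2 reproduce the Item1..Item5 table given in the question. as.data.frame(Y) turns the result back into a data frame if that is needed downstream.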