Dimitri Liakhovitski
2008-Jan-22 17:25 UTC
[R] recoding one variable into another - but differently for different cases
Hello,

I have 2 variables in my sample Data: Data$A and Data$B.
Variable Data$A can assume values: 1, 2, 3, and 4.
Variable Data$B identifies my cases and can assume values: 1 and 2.

I need to recode my variable Data$A into a new variable Data$new such that:

People who are Data[Data$B %in% 1, ] are recoded like this:

Value on Data$A    Value on Data$new
1                  +1
2                  -1
3                   0
4                  99

People who are Data[Data$B %in% 2, ] are recoded like this:

Value on Data$A    Value on Data$new
1                  -1
2                  +1
3                   0
4                  99

I am having a hard time doing this. Any help would be greatly appreciated.

Dimitri
Marc Schwartz
2008-Jan-22 18:29 UTC
[R] recoding one variable into another - but differently for different cases
How about this:

> Data
   A B
14 2 2
12 4 2
6  2 1
10 2 2
15 3 2
9  1 2
8  4 1
3  3 1
4  4 1
11 3 2
16 4 2
5  1 1
2  2 1
7  3 1
13 1 2
1  1 1

# Create a vector of the codes, in sequence for each subset
Codes <- c("+1", "-1", "0", "99", "-1", "+1", "0", "99")

# Create 'new' using indices into 'Codes'
Data$new <- with(Data, Codes[((B - 1) * 4) + A])

> Data
   A B new
14 2 2  +1
12 4 2  99
6  2 1  -1
10 2 2  +1
15 3 2   0
9  1 2  -1
8  4 1  99
3  3 1   0
4  4 1  99
11 3 2   0
16 4 2  99
5  1 1  +1
2  2 1  -1
7  3 1   0
13 1 2  -1
1  1 1  +1

HTH,

Marc Schwartz
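The position arithmetic above can also be written as a two-dimensional lookup. The following is only a sketch, assuming numeric codes are wanted rather than the character strings stored in Codes; the names code_matrix and flat are hypothetical, and the data frame is rebuilt with the same (A, B) combinations as the sample but in a different row order.

```r
# Rebuild a 16-row sample: every (A, B) combination appears twice,
# as in the printed data (row order and row names differ).
Data <- expand.grid(A = 1:4, B = 1:2)
Data <- rbind(Data, Data)

# Hypothetical lookup matrix: rows are values of A, columns are values of B.
code_matrix <- cbind(B1 = c(1, -1, 0, 99),
                     B2 = c(-1, 1, 0, 99))

# A two-column index matrix picks one cell of code_matrix per row of Data.
Data$new <- code_matrix[cbind(Data$A, Data$B)]

# The same lookup with position arithmetic on the flattened (column-major) matrix.
flat <- as.vector(code_matrix)
Data$new2 <- flat[(Data$B - 1) * 4 + Data$A]

print(Data)
```

Both columns hold the same values; the matrix form may be easier to read when the number of categories grows, while the flat vector form needs no extra structure.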
Gabor Grothendieck
2008-Jan-22 19:08 UTC
[R] recoding one variable into another - but differently for different cases
You could create a lookup table or use recode in the car package.

Another possibility is to use a logical/arithmetic expression. The following expression says that

- if A is 1 then use the first term, which equals the coefficient: namely 1 if B == 1 and -1 if B == 2. Also, if A is not 1 then that term is zero and can be ignored.
- if A is 2 or 4 then the second or third terms are used analogously
- otherwise no terms are selected and the expression equals zero

transform(Data, new =
  (A == 1) * ((B == 1) - (B == 2)) +
  (A == 2) * ((B == 2) - (B == 1)) +
  (A == 4) * 99)

This could be reduced even more although at the expense of understandability, e.g.

transform(Data, new = ifelse(A > 2, 99 * (A == 4), (A == B) - (A != B)))

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
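The lookup-table suggestion is not spelled out in the message, so here is one way it might look, using only base R. This is a sketch under assumptions: the data frame names lookup, merged, and arith are hypothetical, and merge() is just one of several ways to join the table. The two expressions from the message are evaluated alongside it so the three approaches can be compared.

```r
# All eight (A, B) combinations from the original question.
Data <- expand.grid(A = 1:4, B = 1:2)

# Hypothetical lookup table: one row per combination with its target code.
lookup <- data.frame(A   = rep(1:4, times = 2),
                     B   = rep(1:2, each = 4),
                     new = c(1, -1, 0, 99, -1, 1, 0, 99))

# merge() joins on the shared columns A and B; note that it may reorder rows.
merged <- merge(Data, lookup, by = c("A", "B"))

# The arithmetic expression from the message, stored in a second column.
arith <- transform(merged,
                   new2 = (A == 1) * ((B == 1) - (B == 2)) +
                          (A == 2) * ((B == 2) - (B == 1)) +
                          (A == 4) * 99)

# The shorter ifelse() form from the message, stored in a third column.
arith$new3 <- with(arith, ifelse(A > 2, 99 * (A == 4), (A == B) - (A != B)))

print(arith)
```

A lookup table keeps the coding scheme as data rather than as logic, which can be easier to audit or change later; the arithmetic forms avoid the join and preserve row order.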
Gabor Grothendieck
2008-Jan-22 22:01 UTC
[R] recoding one variable into another - but differently for different cases
Slight correction of the English:

- if A is 1 then the first term equals the coefficient of (A == 1). That is, the first term equals 1 if B==1 and equals -1 if B==2. If A does not equal 1 then the first term is zero and can be ignored.
- terms 2 and 3 are interpreted analogously
- if A==3 (or other value) then the (A==?) part of each term equals zero so the entire expression is zero.
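To make the term-by-term reading concrete, the three terms can be computed separately for a few cases. This is a small illustration rather than part of the original reply; the four (A, B) pairs and the names term1, term2, term3, and terms are chosen here only for demonstration.

```r
# Four illustrative cases, one for each value of A.
A <- c(1, 2, 3, 4)
B <- c(2, 2, 1, 1)

# Each logical comparison is TRUE/FALSE and is treated as 1/0 in arithmetic.
term1 <- (A == 1) * ((B == 1) - (B == 2))  # nonzero only when A is 1
term2 <- (A == 2) * ((B == 2) - (B == 1))  # nonzero only when A is 2
term3 <- (A == 4) * 99                     # nonzero only when A is 4

# At most one term is nonzero per row, so the sum is the recoded value.
terms <- data.frame(A, B, term1, term2, term3,
                    new = term1 + term2 + term3)
print(terms)
```

For the row with A = 3 every term is zero, which is why no separate term for that value is needed.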
Dimitri Liakhovitski
2008-Jan-22 23:40 UTC
[R] recoding one variable into another - but differently for different cases
Thanks a lot, everyone!

Dimitri
hadley wickham
2008-Jan-23 01:02 UTC
[R] recoding one variable into another - but differently for different cases
No one else mentioned this, but if those 99s represent missings, you should be using NA not a special numeric value.

Hadley

--
http://had.co.nz/
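A short illustration of why this matters, assuming (as the message only conditionally suggests) that 99 really does stand for a missing value. The vector below holds the eight target codes from the question; the names new and new_na are used here only for the example.

```r
# The eight recoded values from the question, with 99 as a sentinel.
new <- c(1, -1, 0, 99, -1, 1, 0, 99)

# Summaries silently include the sentinel as if it were real data.
mean_with_sentinel <- mean(new)

# Replace the sentinel with NA so R knows the values are missing.
new_na <- replace(new, new == 99, NA)

# Summaries now require an explicit decision about the missing values.
mean_without_missing <- mean(new_na, na.rm = TRUE)

print(c(with_99 = mean_with_sentinel, with_NA = mean_without_missing))
```

With NA in place, functions such as mean() return NA unless na.rm = TRUE is given, so a forgotten sentinel cannot quietly distort a result.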
Gabor Grothendieck
2008-Jan-23 02:13 UTC
[R] recoding one variable into another - but differently for different cases
Note that if you do use NA rather than 99 as others have suggested then the A==4 term should use ifelse rather than multiplication since 0 * NA = NA, not 0:

transform(Data, new =
  (A == 1) * ((B == 1) - (B == 2)) +
  (A == 2) * ((B == 2) - (B == 1)) +
  ifelse(A == 4, NA, 0))

In fact, although more verbose, if you find ifelse clearer you might use it in all three terms:
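The archived message ends before the all-ifelse version appears, so the original code is not available. The following is only a guess at what such a version could look like, written as a sketch in base R; it is not the author's code. It also demonstrates the 0 * NA point made above.

```r
# Multiplying by zero does not remove a missing value.
zero_times_na <- 0 * NA
print(zero_times_na)

# All eight (A, B) combinations from the original question.
Data <- expand.grid(A = 1:4, B = 1:2)

# Hypothetical all-ifelse form: each term is either its coefficient or 0,
# except the A == 4 term, which is NA when selected.
Data <- transform(Data,
                  new = ifelse(A == 1, ifelse(B == 1, 1, -1), 0) +
                        ifelse(A == 2, ifelse(B == 2, 1, -1), 0) +
                        ifelse(A == 4, NA, 0))

print(Data)
```

Because anything added to NA is NA, the rows with A equal to 4 come out as NA while every other row keeps the sum of its numeric terms.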