james.arnold at sssc.uk.com
2009-Sep-08 16:01 UTC
[R] Mapping factors to a new set of factors
Hello,

I am trying to map a factor variable within a data frame to a new variable whose entries are derived from the content of the original variable and there are fewer factors in the new variable. That is, I'm trying to set up a surjection.

After first thinking that this would be a common operation and would have a quite simple interface, I can not seem to find one, nor any similar posts on this topic (please correct me if there is something).

Therefore, I have written a function to perform this mapping. However, the function I have written doesn't seem to work with vectors greater than length 1, and as such is useless. Is there any way to ensure the function would work appropriately for each element of the vector input?

    mapLN <- function(x)
    {
        Reg <- levels(df$Var1)
        if (x==Reg[1] | x==Reg[2] | x==Reg[13] | x==Reg[17] | x==Reg[20] | x==Reg[23] | x==Reg[27]) {"North"} else
        if (x==Reg[3] | x==Reg[5] | x==Reg[7] | x==Reg[14] | x==Reg[15] | x==Reg[24] | x==Reg[30]) {"East"} else
        if (x==Reg[4] | x==Reg[6] | x==Reg[8] | x==Reg[9] | x==Reg[11] | x==Reg[16] | x==Reg[18] | x==Reg[21] | x==Reg[22] | x==Reg[25] | x==Reg[28] | x==Reg[29] | x==Reg[31]) {"West"} else
        if (x==Reg[10] | x==Reg[12] | x==Reg[19] | x==Reg[26] | x==Reg[32]) {"South"} else
        stop("Not in original set")
    }

Many thanks,
James

This E-Mail is confidential and intended solely for the use of the individual to whom it is addressed. If you are not the addressee, any disclosure, reproduction, copying, distribution or other dissemination or use of this communication is strictly prohibited. If you have received this transmission in error please notify the sender immediately by replying to this e-mail, or telephone 01382 207 222, and then delete this e-mail.

All outgoing messages are checked for viruses however no guarantee is given that this e-mail message, and any attachments, are free from viruses. You are strongly recommend to check for viruses using your own virus scanner. Neither SCRC or SSSC will accept responsibility for any damage caused as a result of virus infection.
Use 'ifelse':

    # not tested; you supply data for the '%in%'
    map <- function(x){
        ifelse(x %in% c('a','b'), "North",
            ifelse(x %in% c('c','d'), "South",
                ifelse(x %in% c('e', 'f'), "East",
                    ifelse(x %in% c('g', 'h'), "West", NA))))
    }

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
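The nested ifelse suggestion can be tried on made-up data. The sketch below is only an illustration: the level names 'a' to 'h', the extra value 'z', and the function name map_region are placeholders, not the real regions from the original post.

```r
# Hypothetical data: made-up levels standing in for the real regions.
x <- factor(c("a", "c", "e", "g", "b", "z"))

# ifelse() and %in% are vectorised, so the whole vector is mapped in one call.
# Values that match none of the groups fall through to NA.
map_region <- function(x) {
  ifelse(x %in% c("a", "b"), "North",
  ifelse(x %in% c("c", "d"), "South",
  ifelse(x %in% c("e", "f"), "East",
  ifelse(x %in% c("g", "h"), "West", NA))))
}

region_chr <- map_region(x)     # character vector, one entry per element of x
region <- factor(region_chr)    # convert back to a factor if one is needed
print(region)
```

Unlike `if`, which expects a single TRUE or FALSE, this version works element by element, so it needs no loop or sapply.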
Hi James,

Take a look at the "recode" function in the "car" package. It might be useful in this case.

HTH,
Jorge
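For readers who would rather not load an extra package, base R can also collapse levels by assigning a named list to levels(). This is a different technique from recode, shown here only as a sketch on made-up level names ('a' to 'e' are assumptions, not the poster's data).

```r
# Hypothetical factor with made-up level names.
f <- factor(c("a", "b", "c", "d", "e", "a"))

# Assigning a named list to levels() merges old levels into new ones:
# each list name is a new level, each element lists the old levels it absorbs.
# Old levels that are not mentioned anywhere in the list become NA.
g <- f
levels(g) <- list(North = c("a", "b"), South = c("c", "d"), East = "e")
print(g)
```

The new levels appear in the order of the list names, which makes the level order explicit rather than alphabetical.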
james.arnold at sssc.uk.com
2009-Sep-11 12:48 UTC
[R] Mapping factors to a new set of factors
Thanks a lot Phil,

Recode is exactly what I was looking for. I managed to get my old function working using sapply, but the performance was horrendously slow!

One other thing: the lvls vector only seems to work when it is set in the global environment. If lvls is a local variable inside a function that sets it and then calls recode, recode does not seem to be able to see it.

Thanks,
James

-----Original Message-----
From: Phil Spector [mailto:spector at stat.berkeley.edu]
Sent: 08 September 2009 22:25
To: Arnold, James
Subject: Re: [R] Mapping factors to a new set of factors

James -
   If you need to do something like this, I strongly recommend the recode function of the car package. You can use it like this:

    library(car)
    recode(x, 'lvls[c(1,2,13,17,20,23,27)]="North";
               lvls[c(3,5,7,14,15,24,30)]="East";
               lvls[c(4,6,8,9,11,16,18,21,22,25,28,29,31)]="West";
               lvls[c(10,12,19,26,32)]="South";
               else="Not In Original Set"')

Including the as.factor=FALSE argument to recode will return a character vector -- by default it returns a factor.

    - Phil Spector
      Statistical Computing Facility
      Department of Statistics
      UC Berkeley
      spector at stat.berkeley.edu
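One way to sidestep both the slow sapply loop and the question of where lvls has to live is a plain lookup indexed by the factor's integer codes, using only base R. The sketch below is an assumption-laden illustration, not the recode approach from the thread: the helper name map_levels, the groups list, and the sample data are all hypothetical.

```r
# Hypothetical helper: map each level, by position, to a region label.
# Every variable is local to the function, so nothing has to exist in the
# global environment.
map_levels <- function(x, groups) {
  lvls <- levels(x)
  # One slot per original level; positions not listed in `groups` stay NA.
  lookup <- rep(NA_character_, length(lvls))
  for (region in names(groups)) lookup[groups[[region]]] <- region
  # as.integer(x) gives each element's level position, so a single vectorised
  # index operation maps the whole vector.
  factor(lookup[as.integer(x)], levels = names(groups))
}

x <- factor(c("p", "q", "r", "s", "q"))            # made-up data
groups <- list(North = c(1, 2), South = c(3, 4))   # made-up level positions
mapped <- map_levels(x, groups)
print(mapped)
```

The level positions play the same role as the indices into lvls in the recode call, and the loop runs once per region rather than once per data element.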