james.arnold at sssc.uk.com
2009-Sep-08 16:01 UTC
[R] Mapping factors to a new set of factors
Hello,
I am trying to map a factor variable within a data frame to a new variable whose
entries are derived from the content of the original variable and which has
fewer levels than the original. That is, I'm trying to set up a surjection.
I first thought that this would be a common operation with a quite simple
interface, but I cannot seem to find one, nor any similar posts on this topic
(please correct me if there is something).
Therefore, I have written a function to perform this mapping. However, the
function I have written doesn't seem to work with vectors of length greater
than 1, and as such is useless. Is there any way to ensure the function would
work appropriately for each element of the vector input?
mapLN <- function(x)
{
    Reg <- levels(df$Var1)
    if (x==Reg[1] | x==Reg[2] | x==Reg[13] | x==Reg[17] | x==Reg[20] |
        x==Reg[23] | x==Reg[27]) {"North"} else
    if (x==Reg[3] | x==Reg[5] | x==Reg[7] | x==Reg[14] | x==Reg[15] |
        x==Reg[24] | x==Reg[30]) {"East"} else
    if (x==Reg[4] | x==Reg[6] | x==Reg[8] | x==Reg[9] | x==Reg[11] |
        x==Reg[16] | x==Reg[18] | x==Reg[21] | x==Reg[22] | x==Reg[25] |
        x==Reg[28] | x==Reg[29] | x==Reg[31]) {"West"} else
    if (x==Reg[10] | x==Reg[12] | x==Reg[19] | x==Reg[26] | x==Reg[32])
        {"South"} else
    stop("Not in original set")
}
Many thanks,
James
This E-Mail is confidential and intended solely for the use of the individual to
whom it is addressed. If you are not the addressee, any disclosure,
reproduction, copying, distribution or other dissemination or use of this
communication is strictly prohibited. If you have received this transmission in
error please notify the sender immediately by replying to this e-mail, or
telephone 01382 207 222, and then delete this e-mail.
All outgoing messages are checked for viruses however no guarantee is given that
this e-mail message, and any attachments, are free from viruses. You are
strongly recommend to check for viruses using your own virus scanner. Neither
SCRC or SSSC will accept responsibility for any damage caused as a result of
virus infection.
Use 'ifelse':
# not tested; you supply data for the '%in%'
map <- function(x){
    ifelse(x %in% c('a', 'b'), "North",
        ifelse(x %in% c('c', 'd'), "South",
            ifelse(x %in% c('e', 'f'), "East",
                ifelse(x %in% c('g', 'h'), "West", NA))))
}
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
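A minimal sketch of the nested ifelse suggestion above, filled in with made-up level names ("a" through "h") because the thread does not list the real levels of df$Var1. The original mapLN breaks on longer vectors because if() tests a single condition, whereas ifelse() and %in% both work element by element. The function name map_region and the sample data are assumptions for illustration only.

```r
# Hypothetical levels "a".."h"; substitute the real levels of df$Var1.
map_region <- function(x) {
  x <- as.character(x)  # compare level labels rather than integer codes
  ifelse(x %in% c("a", "b"), "North",
  ifelse(x %in% c("c", "d"), "South",
  ifelse(x %in% c("e", "f"), "East",
  ifelse(x %in% c("g", "h"), "West", NA_character_))))
}

var1   <- factor(c("a", "h", "e", "c", "b", "z"))
mapped <- map_region(var1)   # character vector, one entry per element
region <- factor(mapped)     # convert back to a factor if needed
```

Values that match none of the groups (here "z") come back as NA instead of raising an error, which differs from the stop() call in the original function.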
What is the problem that you are trying to solve?
Hi James,
Take a look at the "recode" function in the "car" package. It might be useful in this case.
HTH,
Jorge
james.arnold at sssc.uk.com
2009-Sep-11 12:48 UTC
[R] Mapping factors to a new set of factors
Thanks a lot Phil,
Recode is exactly what I was looking for. I managed to get my old function
working using sapply, but the performance was horrendously slow!
One other thing: the lvls vector apparently has to be defined in the global
environment. If I define it as a local variable inside a function that then
calls recode, recode does not seem to be able to see it.
Thanks,
James
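One possible way around the scoping issue mentioned above, sketched in base R rather than with recode: assigning a named list to levels() merges the old levels into the new ones, and everything stays local to the function. The six level names and the function name map_regions are made up for illustration; this is an alternative technique, not what recode does internally.

```r
# Hypothetical sketch: six made-up levels collapsed into three groups.
map_regions <- function(f) {
  lvls <- levels(f)  # local variable; nothing needs to live in the global scope
  levels(f) <- list(
    North = lvls[c(1, 2)],
    East  = lvls[c(3, 5)],
    West  = lvls[c(4, 6)]
  )
  f
}

var1   <- factor(c("r1", "r2", "r3", "r4", "r5", "r6", "r1"))
region <- map_regions(var1)
```

Any old level that is not named in the list becomes NA in the result, so it is worth checking that every index appears in exactly one group.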
-----Original Message-----
From: Phil Spector [mailto:spector at stat.berkeley.edu]
Sent: 08 September 2009 22:25
To: Arnold, James
Subject: Re: [R] Mapping factors to a new set of factors
James -
If you need to do something like this, I strongly recommend
the recode function of the car package. You can use it like this:
library(car)
# lvls is assumed to hold the levels of the original factor,
# e.g. lvls <- levels(df$Var1)
recode(x, 'lvls[c(1,2,13,17,20,23,27)]="North";
           lvls[c(3,5,7,14,15,24,30)]="East";
           lvls[c(4,6,8,9,11,16,18,21,22,25,28,29,31)]="West";
           lvls[c(10,12,19,26,32)]="South";
           else="Not In Original Set"')
Including the as.factor=FALSE argument to recode will return
a character vector -- by default it returns a factor.
- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spector at stat.berkeley.edu
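For comparison, the same index-based grouping can be sketched in base R with a lookup vector indexed by the factor's integer codes. This is not the recode approach described above, only an alternative under the assumption of six made-up levels; the names lookup, region_chr and region_fac are hypothetical.

```r
# Hypothetical data: six made-up levels standing in for the 32 real ones.
var1 <- factor(c("r3", "r1", "r6", "r2", "r5", "r4"))
lvls <- levels(var1)   # "r1" "r2" "r3" "r4" "r5" "r6"

# One slot per original level; position i holds the new label for lvls[i].
lookup <- rep(NA_character_, length(lvls))
lookup[c(1, 2)] <- "North"
lookup[c(3, 5)] <- "East"
lookup[c(4, 6)] <- "West"

region_chr <- lookup[as.integer(var1)]  # character vector
region_fac <- factor(region_chr)        # factor, if that is preferred
```

Because the lookup is a single vectorised indexing step, it avoids the per-element function calls that made the sapply version slow.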