Hi,
I've got a data frame with multiple factor columns that should all share
the same set of labels, as in this tiny example:
df <- data.frame(
  a = factor( c( "bob", "alice", "bob" ) ),
  b = factor( c( "kenny", "alice", "alice" ) )
)
In my data, though, the strings are enormous. I would like to replace them
with integers, which would take up so much less space that I could actually
understand the screen display. Converting a single factor column to
integers is easy: You can just use as.numeric(). But sharing the labels
across two vectors has eluded me.
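For a single column, this is all I mean (a minimal sketch; the vector name x is just for illustration):

```r
# A lone factor: levels are sorted alphabetically, so alice = 1, bob = 2.
x <- factor(c("bob", "alice", "bob"))

levels(x)       # "alice" "bob"
as.numeric(x)   # 2 1 2
```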
Here's what I tried: First, collect all the levels into a single vector:
> allLevels <- unique( c( levels( df$a ), levels( df$b ) ) )
> allLevels
[1] "alice" "bob" "kenny"
Now change the "levels" attribute of each vector:
> for (c in colnames(df)) levels( df[,c] ) <- allLevels
> data.frame (
+ as.numeric( a ),
+ as.numeric( b )
+ )
  as.numeric.a. as.numeric.b.
1             2             2
2             1             1
3             2             1
What happened to Kenny?
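My tentative guess, in case it helps anyone answer: assigning to levels() renames the existing levels by position rather than re-matching the values, so the second level of b ("kenny") simply got relabelled "bob". Below is a minimal sketch of what I think would avoid that, assuming each factor can be rebuilt from its labels with factor(x, levels = ...). The name codes and the anonymous function are just for illustration. Corrections welcome.

```r
# Same toy data as in the example above.
df <- data.frame(
  a = factor(c("bob", "alice", "bob")),
  b = factor(c("kenny", "alice", "alice"))
)

# Union of all labels, collected from every column.
allLevels <- sort(unique(unlist(lapply(df, levels))))

# Rebuild each factor from its labels instead of overwriting levels():
# factor(x, levels = ...) matches values by label, so the integer codes
# are recomputed against the shared level set.
df[] <- lapply(df, function(x) factor(as.character(x), levels = allLevels))

# Integer codes that now mean the same thing in every column.
codes <- data.frame(lapply(df, as.integer))
print(codes)
#   a b
# 1 2 3
# 2 1 1
# 3 2 1
```

With the shared levels in place, allLevels[codes$b] should give back "kenny" "alice" "alice".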
Thanks in advance,
Jeff
--
View this message in context:
http://n4.nabble.com/Sharing-levels-across-multiple-factor-vectors-tp1747714p1747714.html
Sent from the R help mailing list archive at Nabble.com.