When I alter the levels of a factor, why does it alter the names too?

  f <- factor(c(A="one",B="two",C="one",D="one",E="three"),
              levels=c("one","two","three"))
  names(f)
  -- gives [1] "A" "B" "C" "D" "E"
  levels(f) <- c("un","deux","trois")
  names(f)
  -- gives NULL

I'm using R 1.8.0 for Windows.

Damon.
names() is only defined for vectors and lists, and factors are neither. See ?vector and ?names for more info.

---
From: djw1005 at cam.ac.uk
Subject: [R] Factor names & levels

When I alter the levels of a factor, why does it alter the names too?

  f <- factor(c(A="one",B="two",C="one",D="one",E="three"),
              levels=c("one","two","three"))
  names(f)
  -- gives [1] "A" "B" "C" "D" "E"
  levels(f) <- c("un","deux","trois")
  names(f)
  -- gives NULL

I'm using R 1.8.0 for Windows.

Damon.
> names() is only defined for vectors and lists and factors are
> neither. See ?vector and ?names for more info.

?vector tells me that factors are not vectors, but ?names does not tell me that names() is only defined for vectors and lists. If it were, how should I understand the following?

> x <- factor(c("one","three"))
> names(x) <- c("fred","jim")
> names(x)
[1] "fred" "jim"
> class(x)
[1] "factor"
I agree it may not be 100% clear, but ?names does say

  "The default methods get and set the '"names"' attribute of a vector or list."

and if you issue the command:

  methods("names")

you find that the only non-default method is names.dist.

---
Date: Sun, 21 Dec 2003 17:54:25 +0000 (GMT)
From: Damon Wischik <djw1005 at cam.ac.uk>
To: Gabor Grothendieck <ggrothendieck at myway.com>
Cc: <R-help at stat.math.ethz.ch>
Subject: Re: [R] Factor names & levels

> names() is only defined for vectors and lists and factors are
> neither. See ?vector and ?names for more info.

?vector tells me that factors are not vectors, but ?names does not tell me that names() is only defined for vectors and lists. If it were, how should I understand the following?

> x <- factor(c("one","three"))
> names(x) <- c("fred","jim")
> names(x)
[1] "fred" "jim"
> class(x)
[1] "factor"
The effect of names() on factors is undefined. The fact that it coincidentally partially works on factors is just chance.

For it to be well defined, there would need to be a names method and a names<- method for the factor class, or else the default methods would have to be able to handle factors.

It's a bit dangerous to rely on the coincidental behavior of functionality, but in the absence of explicit R support, if using names with factors were important to you, then you could define your own methods like this:

  "names<-.factor" <- function( x, value ) {
      attr(x, "levels") <- value
      x
  }
  names.factor <- function(x) attr( x, "levels" )

  # with the above, this works:
  x <- factor(c("one","three"))
  names(x) <- c("fred","jim")  # implicitly invokes "names<-.factor"
  names(x)                     # implicitly invokes names.factor

Hope this clears it up for you.

---
Date: Mon, 22 Dec 2003 00:38:14 +0000 (GMT)
From: Damon Wischik <djw1005 at cam.ac.uk>
To: Gabor Grothendieck <ggrothendieck at myway.com>
Cc: <R-help at stat.math.ethz.ch>
Subject: Re: [R] Factor names & levels

> I agree it may not be 100% clear but ?names does say
> "The default methods get and set the '"names"' attribute
> of a vector or list." and if you issue the command:
> methods("names")
> you find that the only non-default method is names.dist.

I still want to know how I should understand the following:

> x <- factor(c("one","three"))
> names(x) <- c("fred","jim")
> names(x)
[1] "fred" "jim"
> class(x)
[1] "factor"

Given that names seems to work on factors, I can see two possibilities:
1. It is a bug that it acts as it does;
2. the default method does what it says in the help page, but also does more than just this.

I don't know enough to look at the source code to find out what is going on.

Damon.
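One way to see what names() is doing in Damon's example without reading the source is to inspect the object itself. The following is a minimal sketch, assuming a fresh R session in which the names.factor and names<-.factor methods shown above have not been defined; it only uses base R functions (typeof, attributes, attr).

```r
# Assumes a fresh session: no user-defined names.factor / "names<-.factor".
x <- factor(c("one", "three"))
names(x) <- c("fred", "jim")

# A factor is stored as integer codes plus attributes.
typeof(x)

# The attribute list holds the levels, the class, and the names just set.
attributes(x)

# The default names() simply reads the "names" attribute.
identical(names(x), attr(x, "names"))
```

This only shows where the names are stored. It does not settle whether other factor methods will carry that attribute along.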
Based on Peter's response, I think I may have misinterpreted Damon's query. The methods I displayed in my last post in this thread were intended to make names a synonym for levels. If it's desired that names act on factors in the same way that names act on vectors and lists, then the methods I provided would not be correct and, as Peter points out, the other factor methods would have to be examined as well, to ensure that they all work properly with names.

I do have one other idea in terms of a workaround. You could represent your factor as a one-column data frame. The data frame could then have row names, which could be interpreted as the names of the factor. For example,

  f <- data.frame(f = factor(c("A","B","A","C")))
  row.names(f) <- letters[1:4]

You can now refer to the factor as f$f and the names as row.names(f). For example,

> f <- data.frame(f = factor(c("A","B","A","C")))
> row.names(f) <- letters[1:4]
> f
  f
a A
b B
c A
d C
> row.names(f)
[1] "a" "b" "c" "d"
> f$f
[1] A B A C
Levels: A B C

This is all officially supported by R, so it should not get you into trouble, although it does require that your program interpret it accordingly.

---
Date: 22 Dec 2003 02:30:52 +0100
From: Peter Dalgaard <p.dalgaard at biostat.ku.dk>
To: <ggrothendieck at myway.com>
Cc: <djw1005 at cam.ac.uk>, <R-help at stat.math.ethz.ch>
Subject: Re: [R] Factor names & levels

"Gabor Grothendieck" <ggrothendieck at myway.com> writes:

> For it to be well defined, there would need to be a names
> method and a names<- method for the factor class or else
> the default methods would have to be able to handle factors.

Not only that, but other methods for factors need to know about the names and be able to modify them accordingly, e.g.

> getS3method("levels<-","factor")
function (x, value)
{
    xlevs <- levels(x)
    if (is.list(value))
        #something
        ...
    else {
        ...
        nlevs <- xlevs <- as.character(value)
    }
    factor(xlevs[x], levels = unique(nlevs))
}

Here, xlevs[x] will not have the same names as x (it gets names from xlevs if anything), so you'd have to have extra code for setting the names on the result.

(Rather interestingly, the factor() function does explicitly retain names, so there are not quite as many places where they will be lost as I would have expected.)

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
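The "extra code for setting the names on the result" that Peter mentions can also live on the caller's side. The following is a minimal sketch, not part of R: set_levels_keep_names is a hypothetical helper name introduced here for illustration. It saves the names before calling levels<- and restores them afterwards, so the result does not depend on whether the installed levels<- method keeps names.

```r
# Hypothetical helper (not part of R): change the levels of a factor
# and put the original element names back afterwards.
set_levels_keep_names <- function(f, new_levels) {
  nm <- names(f)            # remember the names before touching the levels
  levels(f) <- new_levels   # may or may not drop names, depending on the R version
  names(f) <- nm            # restore them (a NULL stays NULL)
  f
}

# The factor from the original question.
f <- factor(c(A = "one", B = "two", C = "one", D = "one", E = "three"),
            levels = c("one", "two", "three"))

g <- set_levels_keep_names(f, c("un", "deux", "trois"))
names(g)
levels(g)
```

Because this relies on names() working on factors at all, it carries the same caveat raised earlier in the thread: that behavior comes from the default methods rather than from a dedicated factor method.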
> If it's desired that name act on factors in the same way that names
> act on vectors and lists then the methods I provided would not
> be correct and, as Peter points out, the other factor methods
> would have to be examined, as well, to ensure that they all
> work properly with names.

Thank you for all your answers.

Damon.