When I alter the levels of a factor, why does it alter the names too?
f <- factor(c(A="one",B="two",C="one",D="one",E="three"),
            levels=c("one","two","three"))
names(f)
-- gives [1] "A" "B" "C" "D" "E"
levels(f) <- c("un","deux","trois")
names(f)
-- gives NULL
I'm using R 1.8.0 for Windows.
Damon.
names() is only defined for vectors and lists and factors are
neither. See ?vector and ?names for more info.
---
From: djw1005 at cam.ac.uk
Subject: [R] Factor names & levels
> names() is only defined for vectors and lists and factors are
> neither. See ?vector and ?names for more info.
?vector tells me that factors are not vectors, but ?names does not tell me
that names() is only defined for vectors and lists. If it were, how
should I understand the following?
> x <- factor(c("one","three"))
> names(x) <- c("fred","jim")
> names(x)
[1] "fred" "jim"
> class(x)
[1] "factor"
I agree it may not be 100% clear but ?names does say
"The default methods get and set the '"names"' attribute
of a vector or list." and if you issue the command:
methods("names")
you find that the only non-default method is names.dist.
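One way to see what the default methods do with a factor is to look at the object's attributes directly. The following is only a minimal sketch using base R and the same x as in the question. It assumes nothing beyond what the thread shows, namely that names<- does accept a factor, and it prints where the names end up rather than claiming what the documentation guarantees.

```r
# A factor is stored as integer codes plus a set of attributes.
x <- factor(c("one", "three"))
names(x) <- c("fred", "jim")

typeof(x)             # storage type of the underlying codes
attr(x, "names")      # what the default names<- method stored
names(attributes(x))  # which attributes the object now carries
```

If "names" shows up next to "levels" and "class", the names are an ordinary attribute of the object, and whether they survive a later operation depends on whether that operation copies attributes.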
Date: Sun, 21 Dec 2003 17:54:25 +0000 (GMT)
From: Damon Wischik <djw1005 at cam.ac.uk>
To: Gabor Grothendieck <ggrothendieck at myway.com>
Cc: <R-help at stat.math.ethz.ch>
Subject: Re: [R] Factor names & levels
> names() is only defined for vectors and lists and factors are
> neither. See ?vector and ?names for more info.
?vector tells me that factors are not vectors, but ?names does not tell me
that names() is only defined for vectors and lists. If it were, how
should I understand the following?
> x <- factor(c("one","three"))
> names(x) <- c("fred","jim")
> names(x)
[1] "fred" "jim"
> class(x)
[1] "factor"
The effect of names() on factors is undefined. The fact
that it coincidentally partially works on factors is just
chance.
For it to be well defined, there would need to be a names
method and a names<- method for the factor class or else
the default methods would have to be able to handle factors.
It's a bit dangerous to rely on the coincidental behavior of
functionality, but in the absence of explicit R support, if
using names with factors were important to you then you could
define your own methods like this:
"names<-.factor" <- function(x, value) {
  attr(x, "levels") <- value
  x
}
names.factor <- function(x) attr(x, "levels")
# with the above, this works:
x <- factor(c("one","three"))
names(x) <- c("fred","jim")  # implicitly invokes "names<-.factor"
names(x)                     # implicitly invokes names.factor
Hope this clears it up for you.
---
Date: Mon, 22 Dec 2003 00:38:14 +0000 (GMT)
From: Damon Wischik <djw1005 at cam.ac.uk>
To: Gabor Grothendieck <ggrothendieck at myway.com>
Cc: <R-help at stat.math.ethz.ch>
Subject: Re: [R] Factor names & levels
> I agree it may not be 100% clear but ?names does say
> "The default methods get and set the '"names"' attribute
> of a vector or list." and if you issue the command:
> methods("names")
> you find that the only non-default method is names.dist.
I still want to know how I should understand the following:
> x <- factor(c("one","three"))
> names(x) <- c("fred","jim")
> names(x)
[1] "fred" "jim"
> class(x)
[1] "factor"
Given that names seems to work on factors, I can see two possibilities:
1. It is a bug that it acts as it does;
2. the default method does what it says in the help page, but also does
more than just this.
I don't know enough to look at the source code to find out what is going
on.
Damon.
Based on Peter's response, I think I may have misinterpreted
Damon's query. The methods I displayed in my last post in
this thread were intended to make names a synonym for levels. If
it's desired that names act on factors in the same way that it
acts on vectors and lists, then the methods I provided would not
be correct and, as Peter points out, the other factor methods
would have to be examined as well, to ensure that they all
work properly with names.
I do have one other idea in terms of a workaround. You could
represent your factor as a one column data frame. The data
frame could then have row names which could be interpreted as
names of the factor.
For example,
f <- data.frame(f = factor(c("A","B","A","C")))
row.names(f) <- letters[1:4]
You can now refer to the factor as f$f and the names as row.names(f).
For example,
> f <- data.frame(f = factor(c("A","B","A","C")))
> row.names(f) <- letters[1:4]
> f
  f
a A
b B
c A
d C
> row.names(f)
[1] "a" "b" "c" "d"
> f$f
[1] A B A C
Levels: A B C
This is all officially supported by R so it should not get you
into trouble although it does require that your program
interpret it accordingly.
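As a hedged illustration of why this sidesteps the original problem: the row names belong to the data frame rather than to the factor, so replacing the levels of the column should leave them alone. The object name df and the replacement levels below are made up for this sketch.

```r
# One-column data frame: the factor is the column, the "names" are row names.
df <- data.frame(f = factor(c("A", "B", "A", "C")))
row.names(df) <- letters[1:4]

# Replace the levels of the factor column only.
levels(df$f) <- c("alpha", "beta", "gamma")

row.names(df)   # the row names are stored on the data frame, not on df$f
df["b", "f"]    # look up a single element by its row name
```

The cost is that every lookup by name has to go through the data frame (df["b", "f"]) rather than through the factor itself.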
---
Date: 22 Dec 2003 02:30:52 +0100
From: Peter Dalgaard <p.dalgaard at biostat.ku.dk>
To: <ggrothendieck at myway.com>
Cc: <djw1005 at cam.ac.uk>, <R-help at stat.math.ethz.ch>
Subject: Re: [R] Factor names & levels
"Gabor Grothendieck" <ggrothendieck at myway.com> writes:
> For it to be well defined, there would need to be a names
> method and a names<- method for the factor class or else
> the default methods would have to be able to handle factors.
Not only that but other methods for factors need to know about the
names and be able to modify them accordingly, e.g.
> getS3method("levels<-","factor")
function (x, value)
{
xlevs <- levels(x)
if (is.list(value)) #something
...
else {
...
nlevs <- xlevs <- as.character(value)
}
factor(xlevs[x], levels = unique(nlevs))
}
Here, xlevs[x] will not have the same names as x (it gets names from
xlevs if anything) so you'd have to have extra code for setting the
names on the result.
(Rather interestingly, the factor() function does explicitly retain
names, so there are not quite as many places where they will be lost
as I would have expected.)
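In user code, the "extra code for setting the names" can be as small as saving the names before the level replacement and restoring them afterwards. The sketch below assumes a hypothetical helper, set_levels_keep_names(), which is not part of R. It does not modify the levels<- method itself, it only wraps it.

```r
# Hypothetical helper (not part of R): replace the levels of a factor
# while carrying its names across the assignment.
set_levels_keep_names <- function(f, new_levels) {
  stopifnot(is.factor(f))
  nms <- names(f)           # NULL if the factor has no names
  levels(f) <- new_levels   # may or may not drop names, depending on R version
  names(f) <- nms           # restore whatever was there before
  f
}

f <- factor(c(A = "one", B = "two", C = "one", D = "one", E = "three"),
            levels = c("one", "two", "three"))
g <- set_levels_keep_names(f, c("un", "deux", "trois"))
names(g)
levels(g)
```

Newer versions of R may already keep the names across levels<-, in which case the helper is harmless but unnecessary. It is meant for cases like the R 1.8.0 behaviour reported at the top of the thread.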
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
> If
> its desired that name act on factors in the same way that names
> act on vectors and lists then the methods I provided would not
> be correct and, as Peter points out, the other factor methods
> would have to be examined, as well, to ensure that they all
> work properly with names.
Thank you for all your answers.
Damon.