Erik Iverson
2008-May-21 19:32 UTC
[Rd] rbind.data.frame drops attributes for factor variables
Dear R-devel -
I noticed that when I rbind two data.frames together, factor variables
lose their attributes in the resulting data.frame while numeric
variables do not.
As an example, create two data.frames, t1 and t2, with two variables
each. Give each variable an attribute called "label", and then perform
the rbind and look at the resulting structure.
#### EXAMPLE R CODE #####
t1 <- data.frame(subject = 1:4, trt = factor(c("A","B","B","A")))
attr(t1$trt, "label") <- "Trt Label"
attr(t1$subject, "label") <- "Subject Label"
str(t1)
t2 <- data.frame(subject = 5:8, trt = factor(c("A","A","A","A")))
attr(t2$trt, "label") <- "Trt Label"
attr(t2$subject, "label") <- "Subject Label"
str(t2)
str(rbind(t1, t2))
#### END EXAMPLE R CODE #####
The output of the last line of code is:
'data.frame': 8 obs. of 2 variables:
$ subject: atomic 1 2 3 4 5 6 7 8
..- attr(*, "label")= chr "Subject Label"
$ trt : Factor w/ 2 levels "A","B": 1 2 2 1 1 1 1 1
I do not see this documented anywhere in ?rbind, and I do not know
whether it is intended. It looks like the factor loses its attributes in
rbind.data.frame because of a call to as.vector, which is documented to
drop the attributes of atomic vector types. Since ?rbind does not state
what will happen, I cannot tell whether this qualifies as a bug or as
intended behavior. sessionInfo() is below.
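In case it is useful, here is a minimal sketch of one possible workaround:
do the rbind as usual, then copy the attribute back from the columns of the
first data.frame. The helper name rbind_keep_attr and its attr_name argument
are hypothetical (they are not part of base R), and the sketch assumes the
attribute value on the first data.frame is the one wanted on the result.

```r
# Hypothetical helper, not part of base R: rbind the data.frames, then
# re-apply one named attribute from the columns of the first data.frame.
rbind_keep_attr <- function(..., attr_name = "label") {
  dfs <- list(...)
  out <- rbind(...)
  first <- dfs[[1]]
  for (nm in names(out)) {
    # Look up the attribute on the matching column of the first data.frame.
    value <- attr(first[[nm]], attr_name, exact = TRUE)
    if (!is.null(value)) {
      attr(out[[nm]], attr_name) <- value
    }
  }
  out
}

# Same data as in the example above.
t1 <- data.frame(subject = 1:4, trt = factor(c("A","B","B","A")))
attr(t1$trt, "label") <- "Trt Label"
attr(t1$subject, "label") <- "Subject Label"
t2 <- data.frame(subject = 5:8, trt = factor(c("A","A","A","A")))
attr(t2$trt, "label") <- "Trt Label"
attr(t2$subject, "label") <- "Subject Label"

res <- rbind_keep_attr(t1, t2)
str(res)
```

This only restores a single named attribute and ignores whatever is set on
the later data.frames, so it sidesteps the question of what rbind itself
should do rather than answering it.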
Best,
Erik Iverson
sessionInfo()
R version 2.7.0 (2008-04-22)
i686-pc-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
