That is not how you are intended to put character strings in data frames
in S. Rather, use I():
A <- data.frame(a=1, b=I("A"))
B <- data.frame(a=2, b=I("B"))
AB <- rbind(A,B)
and so on, which works (at least in R-devel).
Using $ on a data frame is underhand, and avoids some of the consistency
checks.
We are planning to use I("foo") to put the column in as a character
column and use it consistently, but for 1.8.0, not 1.7.x.
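
A slightly fuller sketch of the I() approach, extended to the third data
frame from the example quoted below. The name ABC and the checks at the end
are added here for illustration only, and this assumes a version of R in
which I() columns survive rbind():

```r
# I() marks b to be kept as-is (character) rather than converted to a factor.
A  <- data.frame(a = 1, b = I("A"))
B  <- data.frame(a = 2, b = I("B"))
C. <- data.frame(a = 3, b = I("C"))

AB  <- rbind(A, B)
ABC <- rbind(AB, C.)   # ABC is a hypothetical name, not from the original post

# Because b holds character values rather than factor codes, a string that
# has not been seen before should not be turned into NA.
is.character(ABC$b)
anyNA(ABC$b)
as.character(ABC$b)
```

The point of building every piece with data.frame() and I(), rather than
assigning with $ afterwards, is that the column type is fixed when the data
frame is created, so rbind() has no factor levels to match against.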
On Mon, 31 Mar 2003, Spencer Graves wrote:
> "rbind(A, B)" converts character columns of A and B to factors. This
> means that "A <- rbind(A, B)" generates NAs unless the character strings
> in B are already levels of the corresponding columns of A.
>
> I've got a work-around, but I'm not happy with it. What do you suggest?
>
> Example:
>
> > A <- data.frame(a=1)
> > A$b <- "A"
> > B <- data.frame(a=2)
> > B$b <- "B"
> > sapply(A, data.class)
> a b
> "numeric" "character"
> > AB <- rbind(A,B)
> > sapply(AB, data.class)
> a b
> "numeric" "factor"
> > C. <- data.frame(a=3)
> > C.$b <- "C"
> > rbind(AB, C.)
> a b
> 1 1 A
> 11 2 B
> 111 3 <NA>
> Warning message:
> invalid factor level, NAs generated in: "[<-.factor"(*tmp*, ri, value = "C")
> > sapply(rbind(AB, C.), data.class)
> a b
> "numeric" "factor"
> Warning message:
> invalid factor level, NAs generated in: "[<-.factor"(*tmp*, ri, value = "C")
>
> Thanks,
> Spencer Graves
> p.s. This example produces the desired result in S-Plus 2000 and 6.1
> Professional for Windows 2000.
I am not at all sure that is intentional, though.
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595