Hi

I'm seeing some "odd" behaviour with cbind(). My code is:

> cat <- read.table("cogs_category.txt", sep="\t", header=TRUE, quote=NULL, colClasses="character")
> colnames(cat)
[1] "Code" "Description"
> is.factor(cat$Code)
[1] FALSE
> is.factor(cat$Description)
[1] FALSE
> is.factor(rainbow(nrow(cat)))
[1] FALSE
> cat <- cbind(cat,"Color"=rainbow(nrow(cat)))
> is.factor(cat$Color)
[1] TRUE
> ?cbind

I read in a text file which has two columns, Code and Description.
Neither of these is a factor. I want to add a column of colours to the
data frame using rainbow(). The rainbow function also does not return a
factor. However, if I cbind my data frame (which has no factors in it)
and the results of rainbow() (which is a vector, not a factor), then for
some reason the new column is a factor...??

Mick

Michael Watson
Head of Informatics
Institute for Animal Health,
Compton Laboratory,
Compton,
Newbury,
Berkshire RG20 7NN
UK

Phone : +44 (0)1635 578411 ext. 2535
Mobile: +44 (0)7990 827831
E-mail: michael.watson at bbsrc.ac.uk
cat is a data.frame, so cbind() uses its data.frame method,
and ?data.frame tells us that:

  Character variables passed to 'data.frame' are converted
  to factor columns unless protected by 'I'.

PS: it is not a good idea to call your data.frame cat, as there is
already a cat function.
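
For anyone reading this in the archive, here is a minimal sketch of the conversion described above. The data frame `df` and its contents are made up for illustration and stand in for the table read from the file. One assumption to flag loudly: since R 4.0.0 the default is stringsAsFactors = FALSE, so the sketch passes stringsAsFactors = TRUE explicitly to get the behaviour discussed in this thread.

```r
# Toy stand-in for the table read from the file (contents are invented).
df <- data.frame(Code = c("A", "B", "C"),
                 Description = c("first", "second", "third"),
                 stringsAsFactors = FALSE)

cols <- rainbow(nrow(df))   # plain character vector of colour strings

# cbind() on a data frame hands its arguments to data.frame(), so the
# character vector is converted when stringsAsFactors is TRUE
# (which was the default before R 4.0.0).
converted <- cbind(df, Color = cols, stringsAsFactors = TRUE)

is.character(cols)          # the input is not a factor
is.factor(converted$Color)  # the new column is
```

On a current R installation, dropping the stringsAsFactors argument should leave Color as a character column.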
At 09:19 10/12/2004, michael watson (IAH-C) wrote:
>I'm seeing some "odd" behaviour with cbind().
>[...]
>for some reason the new column is a factor...??
Stéphane DRAY
--------------------------------------------------------------------------------------------------
Département des Sciences Biologiques
Université de Montréal, C.P. 6128, succursale centre-ville
Montréal, Québec H3C 3J7, Canada
Tel : (514) 343-6111 poste 1233 Fax : (514) 343-2293
E-mail : stephane.dray at umontreal.ca
--------------------------------------------------------------------------------------------------
Web http://www.steph280.freesurf.fr/
This is of the nature of an FAQ. Data frames coerce character
vectors into factors. If you want a character vector to stay that
way (and not become a factor), wrap it up in ``I()'':

    cat <- cbind(cat,Color=I(rainbow(nrow(cat))))

(There's no need to quote the name ``Color'' in the foregoing.)

cheers,

Rolf Turner
rolf at math.unb.ca
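
A small sketch of the ``I()'' protection, again on an invented data frame `df`. It passes stringsAsFactors = TRUE on purpose, to show that the protected column survives even when conversion is requested. The second half, assigning the column with `$<-`, is not something suggested in this thread; it is included only as an alternative that, as far as I know, also leaves character data alone.

```r
# Toy data frame; the contents are invented for illustration.
df <- data.frame(Code = c("A", "B", "C"), stringsAsFactors = FALSE)

# Protected with I(): stays character even though conversion is requested.
protected <- cbind(df, Color = I(rainbow(nrow(df))), stringsAsFactors = TRUE)

# Alternative (not from the thread): assigning with $<- does not convert.
assigned <- df
assigned$Color <- rainbow(nrow(df))

is.factor(protected$Color)    # not a factor
class(protected$Color)        # carries the "AsIs" class added by I()
is.character(assigned$Color)  # plain character column
```

The protected column keeps the "AsIs" class, which is harmless for most uses but worth knowing about if you later test the column's class exactly.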
Probably you called the built-in rainbow function, which returns a string.

> str(rainbow(10))
chr "FF0000"

Dieter Menne
michael watson (IAH-C) <michael.watson <at> bbsrc.ac.uk> writes:
: I'm seeing some "odd" behaviour with cbind().
: [...]
: some reason the new column is a factor...??
Others have already explained the problem and given what is likely
the best solution, but here is one other idea, just in case.

You may require a data frame, depending on what you want to do, but
if you don't then you could alternatively use a character matrix,
since that won't result in any conversions to factor.

Let's call the data frame from read.table Cat.df, and our
matrix Cat.m. Naming it cat is not wrong, but it's confusing
since there is a common R function called cat. Now we can
write the following and not have to worry about factors:

   Cat.df <- read.table(...)
   # create a character matrix and cbind Colors to it
   Cat.m <- cbind(as.matrix(Cat.df), Color = rainbow(nrow(Cat.df)))

If you do find you need a data frame later, you can convert it back
like this:

   Cat.df <- as.data.frame(Cat.m)
   Cat.df[] <- Cat.m  # clobber factors with character data
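
The matrix route above can be run end to end as follows. This is only a sketch: the small data frame is invented and replaces the read.table(...) call, whose arguments are not repeated here, and stringsAsFactors = TRUE is passed to as.data.frame() so that there really are factors to clobber on R 4.0.0 and later (an assumption about the reader's R version, not part of the original suggestion).

```r
# Invented stand-in for the data frame returned by read.table().
Cat.df <- data.frame(Code = c("A", "B"),
                     Description = c("first", "second"),
                     stringsAsFactors = FALSE)

# Character matrix plus a Color column; a matrix never creates factors.
Cat.m <- cbind(as.matrix(Cat.df), Color = rainbow(nrow(Cat.df)))

# Back to a data frame, forcing factor columns so the next step has work to do.
Cat.df <- as.data.frame(Cat.m, stringsAsFactors = TRUE)
had.factors <- all(sapply(Cat.df, is.factor))

# Overwrite every column with the character data from the matrix.
Cat.df[] <- Cat.m

sapply(Cat.df, is.character)  # one value per column
```

The empty brackets in Cat.df[] keep the data frame's structure and replace only its columns, which is why the result is still a data frame rather than a matrix.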