Don't use cbind() -- it builds a matrix, which forces everything into a
single type, here character (string), which data.frame() in turn
converts to factor.
Simply use
data.frame(a, b, c)
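A minimal sketch of the difference, using the vectors from the question below (assuming the "0,6" in b was meant to be 0.6, and writing the quotes in c consistently). The intermediate name m is only for illustration:

```r
# Same vectors as in the question (b's "0,6" assumed to mean 0.6).
a <- c(1, 2, 3, 4, 5)
b <- c(0.3, 0.4, 0.5, 0.6, 0.7)
c <- c("y1", "y2", "y3", "y4", "y5")

# cbind() on plain vectors builds a matrix, and a matrix holds one type,
# so the numbers are coerced to character.
m <- cbind(a, b, c)
typeof(m)            # "character"

# data.frame() keeps each vector as its own column with its own type.
test <- data.frame(a, b, c)
is.numeric(test$a)   # TRUE
is.numeric(test$b)   # TRUE
str(test)
```

Whether column c comes back as a factor or as character depends on the stringsAsFactors setting (the default was TRUE before R 4.0.0 and is FALSE since); either way a and b stay numeric.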
As David mentioned a few days ago, I have no idea who is promoting
this data.frame(cbind(...)) idiom, but it's a terrible idea (albeit
one that seems to have been very frequent over the last few weeks).
Michael
On Tue, Apr 10, 2012 at 10:33 AM, Anser Chen <anser.chen at gmail.com>
wrote:
> Complete newbie to R -- struggling with something which should be pretty
> basic. Trying to create a simple data set (which I gather R refers to as a
> data.frame). So
>
>> a <- c(1,2,3,4,5);
>> b <- c(0.3,0.4,0.5,0.6,0.7);
>
> Stick the two together into a data frame (call test) using cbind
>
>> test <- data.frame(cbind(a,b))
>
> Seems to do the trick:
>
>> test
>   a   b
> 1 1 0.3
> 2 2 0.4
> 3 3 0.5
> 4 4 0.6
> 5 5 0.7
>>
>
> Confirm that each variable is numeric:
>
>> is.numeric(test$a)
> [1] TRUE
>> is.numeric(test$b)
> [1] TRUE
>
>
> OK, so far so good. But, now I want to merge in a vector of characters:
>
>> c <- c("y1","y2","y3","y4","y5")
>
> Confirm that this is string:
>
>> is.numeric(c);
> [1] FALSE
>
> cbind c into the data frame:
>
>> test <- data.frame(cbind(a,b,c))
>
> Looks like everything is in place:
>
>> test
>   a   b  c
> 1 1 0.3 y1
> 2 2 0.4 y2
> 3 3 0.5 y3
> 4 4 0.6 y4
> 5 5 0.7 y5
>
> Except that it seems as if the moment I cbind in a character vector, it
> changes numeric data to string:
>
>> is.numeric(test$a)
> [1] FALSE
>> is.numeric(test$b)
> [1] FALSE
>
> which would explain why the operations I'm trying to perform on elements of
> a and b columns are failing. If I look at the structure of the data.frame,
> I see that in fact *all* the variables are being entered as "factors".
>
>> str(test)
> 'data.frame':   5 obs. of  3 variables:
>  $ a: Factor w/ 5 levels "1","2","3","4",..: 1 2 3 4 5
>  $ b: Factor w/ 5 levels "0.3","0.4","0.5",..: 1 2 3 4 5
>  $ c: Factor w/ 5 levels "y1","y2","y3",..: 1 2 3 4 5
>
> But, if I try
>
>> test <- data.frame(cbind(a,b))
>> str(test)
> 'data.frame':   5 obs. of  2 variables:
>  $ a: num  1 2 3 4 5
>  $ b: num  0.3 0.4 0.5 0.6 0.7
>
> a and b are coming back as numeric. So, why does cbind'ing a column of
> character variables change everything else? And, more to the point, what do
> I need to do to 'correct' the problem (i.e., stop this from happening)?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.