thr3ads.net - R devel - [Rd] 'by' with one-dimensional array [Nov 2008]

Patrick Burns

2008-Nov-16 19:22 UTC

[Rd] 'by' with one-dimensional array

I've played a bit with the problem that Jeff Laake
reported on R-help:

# create data:
jl <- data.frame(x=rep(1, 3),
                 y=tapply(1:9, rep(c('A','B','C'), each=3), sum))
jl2 <- jl
jl2$y <- as.numeric(jl2$y)


# do the test:

 > tapply(jl$y, jl$x, length)
1
3
 > tapply(jl2$y, jl2$x, length)
1
3
 > by(jl2$y, jl2$x, length)
jl2$x: 1
[1] 3
 > by(jl$y, jl$x, length)
INDICES: 1
[1] 1

On the 1-dimensional array, 'by' gives the correct answer
to a question that I don't think many of us thought we
were asking.

Once upon a time 'by' gave 3 as the answer in both
situations.

'by.default' used to be a one-liner, but now decides
what to do based on 'length(dim(data))'.
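
The distinction that 'length(dim(data))' picks up can be checked
directly. This is a minimal sketch that reuses the tapply call from
the data above; the names 'y_arr' and 'y_vec' are made up for
illustration and it does not call 'by' at all:

```r
# tapply() with a single grouping factor returns a 1-dimensional array
y_arr <- tapply(1:9, rep(c('A','B','C'), each=3), sum)

# as.numeric() drops the 'dim' and 'dimnames' attributes
y_vec <- as.numeric(y_arr)

dim(y_arr)           # one extent, of size 3
dim(y_vec)           # NULL

length(dim(y_arr))   # 1, which 'if' treats as TRUE
length(dim(y_vec))   # 0, which 'if' treats as FALSE
```

So the two inputs hold the same numbers and differ only in whether
a 'dim' attribute is present.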

This specific problem goes away if the line:

if (length(dim(data)))

is replaced by:

if(length(dim(data)) > 1)

But I don't know what other mischief such a change
would make.
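
To make the difference between the two tests concrete, here is a
small sketch that evaluates both conditions on a plain vector, a
1-dimensional array and a matrix. It looks only at the conditions
themselves and says nothing about what else in 'by.default' might
depend on them; 'current_test' and 'proposed_test' are hypothetical
helper names, not part of R:

```r
# the condition as it stands, and the proposed replacement
current_test  <- function(data) if (length(dim(data))) TRUE else FALSE
proposed_test <- function(data) if (length(dim(data)) > 1) TRUE else FALSE

inputs <- list(vector   = c(6, 15, 24),
               array_1d = array(c(6, 15, 24), dim = 3),
               matrix   = matrix(1:6, nrow = 3))

# one row per input, one column per condition
res <- data.frame(current  = sapply(inputs, current_test),
                  proposed = sapply(inputs, proposed_test))
print(res)
```

Only the 1-dimensional array row differs between the two columns.
Plain vectors and anything with two or more dimensions are treated
the same under either condition.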

Patrick Burns
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")
