jaroslaw.w.tuszynski@saic.com
2004-Sep-09 20:01 UTC
[Rd] function "apply" and 3D arrays (PR#7221)
Full_Name: jarek tuszynski
Version: 1.8.1
OS: windows 2000
Submission from: (NULL) (198.151.13.10)

Example code:

> a=array(1:27, c(3,3,3))
> apply(a,2, var)
      [,1] [,2] [,3]
 [1,]    1    1    1
 [2,]    1    1    1
 [3,]    1    1    1
 [4,]    1    1    1
 [5,]    1    1    1
 [6,]    1    1    1
 [7,]    1    1    1
 [8,]    1    1    1
 [9,]    1    1    1
> apply(a,2, mean)
[1] 11 14 17
> apply(a,2, sd)
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
[3,]    1    1    1

I could not figure out from the documentation how the MARGIN argument of the function "apply" works for arrays with more than 2 dimensions, so I created the above test code. I still do not know how it is supposed to work, but I should not get results with different dimensions when calculating var and sd.

Hope this helps,

Jarek
The `problem', I think, is your expectation that the output of apply(a, 2, var) will have the same dimensions as the output of apply(a, 2, sd) when a has more than two dimensions. Note that:

> sd(matrix(1:9, 3, 3))
[1] 1 1 1
> var(matrix(1:9, 3, 3))
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
[3,]    1    1    1

because var(), when given a matrix, returns the variance-covariance matrix of the columns. The output of sd() can be a bit surprising:

> sd(array(1:27, rep(3, 3)))
[1] 7.937254

This is because sd() looks like:

> sd
function (x, na.rm = FALSE)
{
    if (is.matrix(x))
        apply(x, 2, sd, na.rm = na.rm)
    else if (is.vector(x))
        sqrt(var(x, na.rm = na.rm))
    else if (is.data.frame(x))
        sapply(x, sd, na.rm = na.rm)
    else sqrt(var(as.vector(x), na.rm = na.rm))
}

So for matrices and data frames, sd() returns the column standard deviations. For anything else, including an array with more than two dimensions, it treats the input as a single vector and computes one SD.

Andy
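The difference between the two functions can be checked with a short sketch. It assumes a later R release than the 1.8.1 discussed here: in current versions sd() applied directly to a matrix treats it as one vector, so the column-wise standard deviations are written out with an explicit apply(). The names m, v, col_sd and whole_sd are illustrative only.

```r
m <- matrix(1:9, 3, 3)

# var() on a matrix returns the variance-covariance matrix of its columns.
v <- var(m)
dim(v)                              # 3 3

# Column-wise standard deviations, computed explicitly so the result does
# not depend on how sd() itself treats a matrix in a given R version.
col_sd <- apply(m, 2, sd)

# The diagonal of the covariance matrix holds the column variances,
# so its square root matches the column-wise standard deviations.
all.equal(sqrt(diag(v)), col_sd)    # TRUE

# A 3-d array is neither a matrix nor a plain vector: sd() flattens it.
whole_sd <- sd(array(1:27, rep(3, 3)))
all.equal(whole_sd, sd(1:27))       # TRUE
```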
[This appears to have been misposted to r-bugs -- there is no bug reported here.]

It's easiest to explore the behavior of apply() on higher-dimensional arrays when your test array has a different extent on each dimension. That way you can easily see what's happening with each dimension. Also, when experimenting, use FUN=function(x) browser() to see what is getting passed to FUN.

The way I remember what apply() does is that the MARGIN argument specifies the dimensions to be kept in the result. In apply(x, MARGIN, FUN), FUN gets passed an object with dimensions dim(x)[-MARGIN]. So, in your example, FUN is getting passed a matrix. The reason you get differently shaped results with mean, sd, and var is that each of these gives a differently shaped result (scalar, vector, and matrix) when given a matrix.

> a <- array(1:24, c(2,3,4))
> apply(a, 2, function(x) browser())
Called from: FUN(array(newX[, i], d.call, dn.call), ...)
Browse[1]> dim(x)
[1] 2 4
Browse[1]> x
     [,1] [,2] [,3] [,4]
[1,]    1    7   13   19
[2,]    2    8   14   20
Browse[1]> mean(x)
[1] 10.5
Browse[1]> sd(x)
[1] 0.7071068 0.7071068 0.7071068 0.7071068
Browse[1]> var(x)
     [,1] [,2] [,3] [,4]
[1,]  0.5  0.5  0.5  0.5
[2,]  0.5  0.5  0.5  0.5
[3,]  0.5  0.5  0.5  0.5
[4,]  0.5  0.5  0.5  0.5

hope this helps,

Tony Plate
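The rule that FUN receives an object with dimensions dim(x)[-MARGIN] can also be checked non-interactively. The sketch below is only an illustration: slice_dim is a hypothetical helper, not part of R, and the column-wise sd is spelled out with an inner apply() because later R releases no longer compute it when sd() is given a matrix.

```r
a <- array(1:24, c(2, 3, 4))

# Hypothetical helper: report the shape of whatever apply() hands to FUN.
slice_dim <- function(x) if (is.null(dim(x))) length(x) else dim(x)

# MARGIN = 2 keeps dimension 2, so FUN sees dim(a)[-2], i.e. a 2 x 4 matrix.
kept_2 <- apply(a, 2, slice_dim)

# MARGIN = c(1, 3) keeps dimensions 1 and 3, so FUN sees a vector of length 3.
kept_13 <- apply(a, c(1, 3), slice_dim)

# The shape of the result follows the shape of what FUN returns per slice.
shape_mean <- apply(a, 2, mean)                         # scalar per slice
shape_sd   <- apply(a, 2, function(x) apply(x, 2, sd))  # length-4 vector per slice
shape_var  <- apply(a, 2, var)                          # 4 x 4 matrix per slice

length(shape_mean)   # 3
dim(shape_sd)        # 4 3
dim(shape_var)       # 16 3
```

Each 4 x 4 covariance matrix is flattened to 16 values before being stored as one column of the result, which is why the var case has 16 rows.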
jaroslaw.w.tuszynski@saic.com writes:

> I could not figure out from the documentation how MARGIN argument of function
> "apply" works in case of arrays with dimensions larger than 2, so I created the
> above test code. I still do not know how it is supposed to work but I should not get
> the results with different dimensions, while calculating var and sd.

All in accordance with documentation, i.e. no bug. The fundamental issue is that

> mean(a[,3,])
[1] 17
> var(a[,3,])
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
[3,]    1    1    1
> sd(a[,3,])
[1] 1 1 1

In particular, var() gives a covariance matrix, but there's no such thing as an sd() matrix, so you get the columnwise sd().

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907
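As a rough check of where the 9 x 3 result in the original report comes from, the sketch below compares one column of apply(a, 2, var) with the covariance matrix of the corresponding slice. The variable names (s3, v3, res, res3d) are illustrative only.

```r
a <- array(1:27, c(3, 3, 3))

s3 <- a[, 3, ]            # the slice that apply(a, 2, FUN) passes for index 3
mean(s3)                  # 17
v3 <- var(s3)             # 3 x 3 covariance matrix of that slice

# apply() flattens each 3 x 3 matrix into 9 values and stores it as a column.
res <- apply(a, 2, var)
dim(res)                                # 9 3
all.equal(res[, 3], as.vector(v3))      # TRUE

# Reshaping recovers one covariance matrix per index of dimension 2.
res3d <- array(res, c(3, 3, 3))
all.equal(res3d[, , 3], v3)             # TRUE
```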