Dear experts,

I wanted to signal a peculiar, unexpected behaviour of 'apply'. It is not a bug, it is per spec, but it is so counterintuitive that I thought it could be interesting.

I have an array, let's say "test", dim=c(7,5).

> test <- array(1:35, dim=c(7, 5))
> test
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    8   15   22   29
[2,]    2    9   16   23   30
[3,]    3   10   17   24   31
[4,]    4   11   18   25   32
[5,]    5   12   19   26   33
[6,]    6   13   20   27   34
[7,]    7   14   21   28   35

I want a new array where the contents of the rows (columns) are permuted, differently per row (per column).

Let's start with the columns, i.e. the second MARGIN of the array:

> test.m2 <- apply(test, 2, sample)
> test.m2
     [,1] [,2] [,3] [,4] [,5]
[1,]    1   10   18   23   32
[2,]    7    9   16   25   30
[3,]    6   14   17   22   33
[4,]    4   11   15   24   34
[5,]    2   12   21   28   31
[6,]    5    8   20   26   29
[7,]    3   13   19   27   35

Perfect. That was exactly what I wanted: the content of each column is shuffled, and differently for each column.

However, if I use the same with the rows (MARGIN = 1), the output is transposed!

> test.m1 <- apply(test, 1, sample)
> test.m1
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]    1    2    3    4    5   13   21
[2,]   22   30   17   18   19   20   35
[3,]   15   23   24   32   26   27   14
[4,]   29   16   31   25   33   34   28
[5,]    8    9   10   11   12    6    7

In other words, I wanted to permute the content of the rows of "test", and I expected to see in the output, well, the shuffled rows as rows, not as columns!

I would respectfully suggest to make this behavior more explicit in the documentation.

Kind regards,
Luca Nanetti

--
Luca Nanetti, MSc, MRI
University Medical Center Groningen
Neuroimaging Center Groningen
Groningen, The Netherlands
Tel: +31 50 363 4733
On 13-05-14 4:52 AM, Luca Nanetti wrote:
> [...]
> I would respectfully suggest to make this behavior more explicit in the
> documentation.

It is already very explicit:

"If each call to FUN returns a vector of length n, then apply returns an array of dimension c(n, dim(X)[MARGIN]) if n > 1."

In your first case, sample is applied to columns and returns length 7 results, so the shape of the final result is c(7, 5). In the second case it is applied to rows and returns length 5 results, so the shape is c(5, 7).

Duncan Murdoch
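The dimension rule quoted from ?apply can be checked directly on the array from the original post. This is only a small sketch; the names res_cols and res_rows and the seed are arbitrary choices made for the illustration, not part of the thread.

```r
# Same 7 x 5 array as in the original post.
test <- array(1:35, dim = c(7, 5))

set.seed(1)  # arbitrary seed, used only to make the shuffles reproducible

# MARGIN = 2: FUN receives each column (n = 7 values); there are 5 columns.
res_cols <- apply(test, 2, sample)
print(dim(res_cols))  # c(n, dim(X)[MARGIN]) with n = 7, dim(X)[2] = 5

# MARGIN = 1: FUN receives each row (n = 5 values); there are 7 rows.
res_rows <- apply(test, 1, sample)
print(dim(res_rows))  # c(n, dim(X)[MARGIN]) with n = 5, dim(X)[1] = 7

# Column j of res_rows is a permutation of row j of test.
print(sort(res_rows[, 3]))
print(test[3, ])
```

In both calls the first dimension of the result is the length of what FUN returns, and the second is the extent of the margin that was iterated over.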
On Tue, 14 May 2013, Luca Nanetti <luca.nanetti at gmail.com> writes:
> [...]
> I would respectfully suggest to make this behavior more explicit in the
> documentation.

As you said yourself, this behaviour is documented:

  "If each call to 'FUN' returns a vector of length 'n', then 'apply'
   returns an array of dimension 'c(n, dim(X)[MARGIN])' [...]"

And it has nothing to do with 'sample'. Try:

  apply(test, 1, function(x) x)
  apply(test, 2, function(x) x)

The result is only counterintuitive (or inconvenient, perhaps) in the special case in which apply is supposed to return an array that has the same dimension as its input. More generally, you will do something like

  apply(test, 1, median)
  apply(test, 1, function(x) list(sum = sum(x), values = x))

and in such cases, apply does not return an array.

--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net
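The calls suggested above can be run as follows to see what comes back in each case. This is a sketch only; the variable names (by_row, by_col, med, lst) are made up for the illustration.

```r
test <- array(1:35, dim = c(7, 5))

# An identity function shows the effect without any randomness.
by_row <- apply(test, 1, function(x) x)  # rows come back as columns
by_col <- apply(test, 2, function(x) x)  # columns stay columns

print(dim(by_row))
print(all(by_row == t(test)))  # same values as the transpose of test
print(all(by_col == test))     # same values as test itself

# When FUN returns a single value per row, the result has no dim attribute.
med <- apply(test, 1, median)
print(is.null(dim(med)))

# When FUN returns a list, apply returns a list with one element per row.
lst <- apply(test, 1, function(x) list(sum = sum(x), values = x))
print(is.list(lst))
print(length(lst))
```

The identity case makes it clear that the transposition comes from how apply assembles its result, not from sample.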
Hello,

The problem is that apply returns the results vector by vector, and each result vector is stored as a column of the output. This is not specific to apply with sample as the function being called; it holds for apply in general. Try, for instance

  apply(test, 1, identity)  # transposes the array

The rows are returned as column vectors. You should expect this behavior from apply with MARGIN = 1. It is in fact documented, in the Value section of ?apply:

Value

If each call to FUN returns a vector of length n, then apply returns an array of dimension c(n, dim(X)[MARGIN]) if n > 1.

The length of the returned vector is the number of rows, and the number of columns is the dim corresponding to MARGIN...

Hope this helps,

Rui Barradas

On 14-05-2013 09:52, Luca Nanetti wrote:
> [...]
> In other words, I wanted to permute the content of the rows of "test", and
> I expected to see in the output, well, the shuffled rows as rows, not as
> column!
> [...]
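Since apply with MARGIN = 1 hands the rows back as columns, one common way to get the layout the original poster expected is to transpose the result with t(). The thread does not spell this out, so take the following as a minimal sketch; shuffled_rows and the seed are illustrative choices.

```r
test <- array(1:35, dim = c(7, 5))

set.seed(42)  # arbitrary seed, used only for reproducibility

# Shuffle within each row, then transpose so rows are rows again.
shuffled_rows <- t(apply(test, 1, sample))
print(dim(shuffled_rows))

# Each row of shuffled_rows holds the same values as the matching row of test.
# The rows of test are already in increasing order, so sorting recovers them.
print(all(t(apply(shuffled_rows, 1, sort)) == test))
```

The check itself needs a t() as well, for the same reason: apply(shuffled_rows, 1, sort) also returns the sorted rows as columns.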
Gabor Grothendieck
2013-May-14 10:28 UTC
[R] Unexpected behavior of "apply" when FUN=sample
On Tue, May 14, 2013 at 4:52 AM, Luca Nanetti <luca.nanetti at gmail.com> wrote:
> [...]
> In other words, I wanted to permute the content of the rows of "test", and
> I expected to see in the output, well, the shuffled rows as rows, not as
> column!
>
> I would respectfully suggest to make this behavior more explicit in the
> documentation.

aaply in the plyr package works in the way you expected.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
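A hedged sketch of the aaply suggestion follows. It assumes the plyr package is installed (it is not part of base R), and the requireNamespace guard, the seed and the name res are additions made for this illustration only.

```r
test <- array(1:35, dim = c(7, 5))

if (requireNamespace("plyr", quietly = TRUE)) {
  set.seed(7)  # arbitrary seed, used only for reproducibility

  # aaply splits along margin 1 (rows) and keeps that margin as the first
  # dimension of the result, so shuffled rows come back as rows.
  res <- plyr::aaply(test, 1, sample)
  print(dim(res))
} else {
  message("plyr is not installed; skipping the aaply example")
}
```

Unlike base apply, the split dimension comes first in the result here, so no transpose should be needed.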
This is Circle 8.1.47 of 'The R Inferno'.

http://www.burns-stat.com/documents/books/the-r-inferno/

Pat

On 14/05/2013 09:52, Luca Nanetti wrote:
> [...]
> In other words, I wanted to permute the content of the rows of "test", and
> I expected to see in the output, well, the shuffled rows as rows, not as
> column!
> [...]

--
Patrick Burns
pburns at pburns.seanet.com
twitter: @burnsstat @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming')