Hi Martin,

Fair enough for R functions in general. But the behaviour of apply violates the expectation that apply(m, 1, fun) calls fun n times when m has n rows. That seems pretty basic.

Also, I understand from your argument why it makes sense to call apply and return a special result (presumably NULL) for an empty argument; but why should apply call fun?

Cheers
David

On Mon, 30 Jul 2018 at 08:41, Martin Maechler <maechler at stat.math.ethz.ch> wrote:

> >>>>> David Hugh-Jones
> >>>>>     on Mon, 30 Jul 2018 05:33:19 +0100 writes:
>
>     > Forgive me if this has been asked many times before, but I
>     > couldn't find anything on the mailing lists.
>
>     > I'd expect apply(m, 1, foo) not to call `foo` if m is a
>     > matrix with zero rows. In fact:
>
>     > m <- matrix(NA, 0, 5)
>     > apply(m, 1, function (x) {cat("Called...\n"); print(x)})
>     > ## Called...
>     > ## [1] FALSE FALSE FALSE FALSE FALSE
>
>     > Similarly for apply(m, 2, ...) if m has no columns. Is
>     > there a reason for this?
>
> Yes :
>
> The reverse is really true for almost all basic R functions:
>
> They *are* called and give an "empty" result automatically
> when the main argument is empty.
>
> What you basically propose is to add an extra
>
>     if(<length 0 input>)
>        return(<correspondingly formatted length-0 output>)
>
> to all R functions. While that makes sense for high-level R
> functions that do a lot of things, this would really be a bad
> idea in general :
>
> This would make all of these basic functions larger {more to maintain} and
> slightly slower for all non-zero cases just to make them
> slightly faster for the rare zero-length case.
>
> Martin Maechler
> ETH Zurich and R core Team

>>>>> David Hugh-Jones
>>>>>     on Mon, 30 Jul 2018 10:12:24 +0100 writes:

    > Hi Martin, Fair enough for R functions in general. But the
    > behaviour of apply violates the expectation that apply(m,
    > 1, fun) calls fun n times when m has n rows. That seems
    > pretty basic.

Well, that expectation is obviously wrong ;-)  see below

    > Also, I understand from your argument why it makes sense
    > to call apply and return a special result (presumably
    > NULL) for an empty argument; but why should apply call fun?

    > Cheers David

The reason is seen e.g. in

> apply(matrix(,0,3), 2, quantile)
     [,1] [,2] [,3]
0%     NA   NA   NA
25%    NA   NA   NA
50%    NA   NA   NA
75%    NA   NA   NA
100%   NA   NA   NA

and that is documented (+/-) in the first paragraph of the
'Value:' section of help(apply) :

  > Value:
  >
  >      If each call to ‘FUN’ returns a vector of length ‘n’, then ‘apply’
  >      returns an array of dimension ‘c(n, dim(X)[MARGIN])’ if ‘n > 1’.
  >      If ‘n’ equals ‘1’, ‘apply’ returns a vector if ‘MARGIN’ has length
  >      1 and an array of dimension ‘dim(X)[MARGIN]’ otherwise.  If ‘n’ is
  >      ‘0’, the result has length 0 but not necessarily the ‘correct’
  >      dimension.

To determine 'n', the function *is* called once even when
length(X) == 0

It may indeed be helpful to add this explicitly to the
help page ( <R>/src/library/base/man/apply.Rd ).
Can you propose a wording (in *.Rd if possible) ?

With regards,
Martin
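
The call-count question raised so far can be checked directly. The following is only a minimal sketch, assuming an R version that behaves as reported in this thread; `count_calls` is a hypothetical helper introduced here for illustration and is not part of base R or of any message above.

```r
# Hypothetical helper: run apply() and count how often FUN is invoked.
count_calls <- function(m, margin) {
  calls <- 0L
  res <- apply(m, margin, function(x) {
    calls <<- calls + 1L   # record each invocation of FUN
    x[1]                   # return a length-1 value
  })
  list(calls = calls, result = res)
}

zero_rows <- count_calls(matrix(NA, 0, 5), 1)  # no rows to iterate over
two_rows  <- count_calls(matrix(NA, 2, 5), 1)  # ordinary case

zero_rows$calls           # 1: FUN is still called once
length(zero_rows$result)  # 0: but the result is empty
two_rows$calls            # 2: one call per row
```

With two rows the count matches the number of rows; with zero rows there is a single extra call, which is the behaviour the original post asks about.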

On Mon, Jul 30, 2018 at 6:08 PM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:

> >>>>> David Hugh-Jones
> >>>>>     on Mon, 30 Jul 2018 10:12:24 +0100 writes:
>
>     > Hi Martin, Fair enough for R functions in general. But the
>     > behaviour of apply violates the expectation that apply(m,
>     > 1, fun) calls fun n times when m has n rows. That seems
>     > pretty basic.
>
> Well, that expectation is obviously wrong ;-)  see below
>
>     > Also, I understand from your argument why it makes sense
>     > to call apply and return a special result (presumably
>     > NULL) for an empty argument; but why should apply call fun?
>
>     > Cheers David
>
> The reason is seen e.g. in
>
> > apply(matrix(,0,3), 2, quantile)
>      [,1] [,2] [,3]
> 0%     NA   NA   NA
> 25%    NA   NA   NA
> 50%    NA   NA   NA
> 75%    NA   NA   NA
> 100%   NA   NA   NA

I don't think this example is relevant to what David is saying: matrix(,0,3) has three columns, so he would expect quantile() to be called 3 times, as it is. I think his question is why quantile() is called at all when the input has 0 rows, as in

apply(matrix(,0,3), 1, quantile)
# named numeric(0)

> and that is documented (+/-) in the first paragraph of the
> 'Value:' section of help(apply) :
>
>   > Value:
>   >
>   >      If each call to ‘FUN’ returns a vector of length ‘n’, then ‘apply’
>   >      returns an array of dimension ‘c(n, dim(X)[MARGIN])’ if ‘n > 1’.
>   >      If ‘n’ equals ‘1’, ‘apply’ returns a vector if ‘MARGIN’ has length
>   >      1 and an array of dimension ‘dim(X)[MARGIN]’ otherwise.  If ‘n’ is
>   >      ‘0’, the result has length 0 but not necessarily the ‘correct’
>   >      dimension.
>
> To determine 'n', the function *is* called once even when
> length(X) == 0

This part of the docs also doesn't seem applicable, and in fact seems incorrect: here we should have (according to the docs)

n = length(quantile(logical(0))) # 5

but the result does not have dim == c(5, 0) as the docs suggest:

dim(apply(matrix(,0,3), 1, quantile))
# NULL

So the length of the result of calling FUN() seems to be ignored in this case, and as Emil points out, is only used to determine the mode of the result.

I can't immediately think of an example where returning NULL instead would make a difference, but there may well be some.

-Deepayan

> It may indeed be helpful to add this explicitly to the
> help page ( <R>/src/library/base/man/apply.Rd ).
> Can you propose a wording (in *.Rd if possible) ?
>
> With regards,
> Martin
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
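
The two cases contrasted in this message can be put side by side. This is a sketch that simply re-runs the calls quoted above and inspects the shape of each result; the variable names `by_col` and `by_row` are illustrative only, and the commented values are the ones reported in the thread.

```r
m <- matrix(, 0, 3)              # logical matrix with 0 rows and 3 columns

by_col <- apply(m, 2, quantile)  # MARGIN = 2: three columns, each of length 0
by_row <- apply(m, 1, quantile)  # MARGIN = 1: the margin itself has extent 0

dim(by_col)     # 5 3
dim(by_row)     # NULL
length(by_row)  # 0
```

Only the second call hits the zero-extent margin: its result has length 0 and no dim attribute, rather than the c(5, 0) that the 'Value:' paragraph would suggest.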

vapply has a mandatory FUN.VALUE argument which specifies the type and size of FUN's return value. This helps when you want to cover the 0-length case without 'if' statements. You can change your apply calls to vapply calls, but they will be a bit more complicated. E.g., change

   apply(X=myMatrix, MARGIN=2, FUN=quantile)

to

   vapply(seq_len(ncol(myMatrix)), FUN=function(i)quantile(myMatrix[,i]), FUN.VALUE=numeric(5))

The latter will always return a 5-row by ncol(myMatrix) matrix.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Jul 30, 2018 at 5:38 AM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:

> >>>>> David Hugh-Jones
> >>>>>     on Mon, 30 Jul 2018 10:12:24 +0100 writes:
>
>     > Hi Martin, Fair enough for R functions in general. But the
>     > behaviour of apply violates the expectation that apply(m,
>     > 1, fun) calls fun n times when m has n rows. That seems
>     > pretty basic.
>
> Well, that expectation is obviously wrong ;-)  see below
>
>     > Also, I understand from your argument why it makes sense
>     > to call apply and return a special result (presumably
>     > NULL) for an empty argument; but why should apply call fun?
>
>     > Cheers David
>
> The reason is seen e.g. in
>
> > apply(matrix(,0,3), 2, quantile)
>      [,1] [,2] [,3]
> 0%     NA   NA   NA
> 25%    NA   NA   NA
> 50%    NA   NA   NA
> 75%    NA   NA   NA
> 100%   NA   NA   NA
>
> and that is documented (+/-) in the first paragraph of the
> 'Value:' section of help(apply) :
>
>   > Value:
>   >
>   >      If each call to ‘FUN’ returns a vector of length ‘n’, then ‘apply’
>   >      returns an array of dimension ‘c(n, dim(X)[MARGIN])’ if ‘n > 1’.
>   >      If ‘n’ equals ‘1’, ‘apply’ returns a vector if ‘MARGIN’ has length
>   >      1 and an array of dimension ‘dim(X)[MARGIN]’ otherwise.  If ‘n’ is
>   >      ‘0’, the result has length 0 but not necessarily the ‘correct’
>   >      dimension.
>
> To determine 'n', the function *is* called once even when
> length(X) == 0
>
> It may indeed be helpful to add this explicitly to the
> help page ( <R>/src/library/base/man/apply.Rd ).
> Can you propose a wording (in *.Rd if possible) ?
>
> With regards,
> Martin
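
The vapply rewrite suggested above can be wrapped up and tried on empty input. This is a minimal sketch under stated assumptions: `col_quantiles` is a hypothetical wrapper name, not something from the thread, and the input is assumed to be a numeric matrix so that quantile() returns five doubles per column.

```r
# Hypothetical wrapper around the vapply() call suggested above.
col_quantiles <- function(m) {
  vapply(seq_len(ncol(m)),
         FUN = function(i) quantile(m[, i]),
         FUN.VALUE = numeric(5))    # FUN must return 5 doubles
}

no_rows <- col_quantiles(matrix(numeric(0), 0, 3))  # 0 rows, 3 columns
no_cols <- col_quantiles(matrix(numeric(0), 3, 0))  # 3 rows, 0 columns
regular <- col_quantiles(matrix(1:6, 2, 3))         # ordinary case

dim(no_rows)  # 5 3
dim(no_cols)  # 5 0
dim(regular)  # 5 3
```

Because FUN.VALUE fixes the shape up front, the zero-column case keeps its five rows without any special-casing, at the cost of indexing the columns by hand.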