Martin Maechler
2015-May-12 12:33 UTC
[Rd] Why is the diag function so slow (for extraction)?
>>>>> Steve Bronder <sbronder at stevebronder.com> >>>>> on Thu, 7 May 2015 11:49:49 -0400 writes:> Is it possible to replace c() with .subset()? It would be possible, but I think "entirely" wrong. .subset() is documented to be an internal function not to be used "lightly" and more to the point it is documented to *NOT* dispatch at all. If you read and understood what Peter and Luke wrote, you'd not special case here: diag() should not work only for pure matrices, but for all "matrix-like" objects for which ``the usual methods'' work, such as as.vector(.), c(.) That's why there has been the c(.) in there. You can always make code faster if you write the code so it only has to work in one special case .. and work there very fast. > Example below > #### > #### > library(microbenchmark)
Henrik Bengtsson
2015-May-13 03:24 UTC
[Rd] Why is the diag function so slow (for extraction)?
Along Luke's lines, would(n't) it be enough to look for existence of attribute 'class' to decide whether to dispatch or not, i.e. if c() is needed or not? Even without .subset(), there is a remarkable improvement. I think it's worth condition the code on dispatch or not. For example: [HB-X201]{hb}: svn diff diag.R Index: diag.R ==================================================================--- diag.R (revision 68345) +++ diag.R (working copy) @@ -23,9 +23,11 @@ stop("'nrow' or 'ncol' cannot be specified when 'x' is a matrix") if((m <- min(dim(x))) == 0L) return(vector(typeof(x), 0L)) + nms <- dimnames(x) + nrow <- dim(x)[1L] ## NB: need double index to avoid overflows. - y <- c(x)[1 + 0L:(m - 1L) * (dim(x)[1L] + 1)] - nms <- dimnames(x) + if (!is.null(attr(x, "class"))) x <- c(x) + y <- x[1 + 0L:(m - 1L) * (nrow + 1)] if (is.list(nms) && !any(sapply(nms, is.null)) && identical((nm <- nms[[1L]][seq_len(m)]), nms[[2L]][seq_len(m)])) names(y) <- nm ? /Henrik On Tue, May 12, 2015 at 5:33 AM, Martin Maechler <maechler at lynne.stat.math.ethz.ch> wrote:>>>>>> Steve Bronder <sbronder at stevebronder.com> >>>>>> on Thu, 7 May 2015 11:49:49 -0400 writes: > > > Is it possible to replace c() with .subset()? > > It would be possible, but I think "entirely" wrong. > > .subset() is documented to be an internal function not to be > used "lightly" and more to the point it is documented to *NOT* > dispatch at all. > > If you read and understood what Peter and Luke wrote, you'd not > special case here: > > diag() should not work only for pure matrices, but for all > "matrix-like" objects for which ``the usual methods'' work, such > as > as.vector(.), c(.) > > That's why there has been the c(.) in there. > > You can always make code faster if you write the code so it only > has to work in one special case .. and work there very fast. > > > > Example below > > #### > > #### > > > library(microbenchmark) > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel
Henrik Bengtsson
2015-May-13 16:20 UTC
[Rd] Why is the diag function so slow (for extraction)?
As kindly pointed out to me (oh my decaying gray matter), is.object() is better suited for this test; $ svn diff src/library/base/R/diag.R Index: src/library/base/R/diag.R ==================================================================--- src/library/base/R/diag.R (revision 68345) +++ src/library/base/R/diag.R (working copy) @@ -23,9 +23,11 @@ stop("'nrow' or 'ncol' cannot be specified when 'x' is a matrix") if((m <- min(dim(x))) == 0L) return(vector(typeof(x), 0L)) + nms <- dimnames(x) + nrow <- dim(x)[1L] ## NB: need double index to avoid overflows. - y <- c(x)[1 + 0L:(m - 1L) * (dim(x)[1L] + 1)] - nms <- dimnames(x) + if (is.object(x)) x <- c(x) /Henrik On Tue, May 12, 2015 at 8:24 PM, Henrik Bengtsson <henrik.bengtsson at ucsf.edu> wrote:> Along Luke's lines, would(n't) it be enough to look for existence of > attribute 'class' to decide whether to dispatch or not, i.e. if c() is > needed or not? Even without .subset(), there is a remarkable > improvement. I think it's worth condition the code on dispatch or > not. For example: > > [HB-X201]{hb}: svn diff diag.R > Index: diag.R > ==================================================================> --- diag.R (revision 68345) > +++ diag.R (working copy) > @@ -23,9 +23,11 @@ > stop("'nrow' or 'ncol' cannot be specified when 'x' is a matrix") > > if((m <- min(dim(x))) == 0L) return(vector(typeof(x), 0L)) > + nms <- dimnames(x) > + nrow <- dim(x)[1L] > ## NB: need double index to avoid overflows. > - y <- c(x)[1 + 0L:(m - 1L) * (dim(x)[1L] + 1)] > - nms <- dimnames(x) > + if (!is.null(attr(x, "class"))) x <- c(x) > + y <- x[1 + 0L:(m - 1L) * (nrow + 1)] > if (is.list(nms) && !any(sapply(nms, is.null)) && > identical((nm <- nms[[1L]][seq_len(m)]), nms[[2L]][seq_len(m)])) > names(y) <- nm > > ? > > /Henrik > > On Tue, May 12, 2015 at 5:33 AM, Martin Maechler > <maechler at lynne.stat.math.ethz.ch> wrote: >>>>>>> Steve Bronder <sbronder at stevebronder.com> >>>>>>> on Thu, 7 May 2015 11:49:49 -0400 writes: >> >> > Is it possible to replace c() with .subset()? >> >> It would be possible, but I think "entirely" wrong. >> >> .subset() is documented to be an internal function not to be >> used "lightly" and more to the point it is documented to *NOT* >> dispatch at all. >> >> If you read and understood what Peter and Luke wrote, you'd not >> special case here: >> >> diag() should not work only for pure matrices, but for all >> "matrix-like" objects for which ``the usual methods'' work, such >> as >> as.vector(.), c(.) >> >> That's why there has been the c(.) in there. >> >> You can always make code faster if you write the code so it only >> has to work in one special case .. and work there very fast. >> >> >> > Example below >> > #### >> > #### >> >> > library(microbenchmark) >> >> ______________________________________________ >> R-devel at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel