thr3ads.net - R devel - [Rd] Why is the diag function so slow (for extraction)? [May 2015]

If this information is useful, please help other people find it:
Share via:

Martin Maechler

2015-May-12 12:33 UTC

[Rd] Why is the diag function so slow (for extraction)?

>>>>> Steve Bronder <sbronder at stevebronder.com>
>>>>>     on Thu, 7 May 2015 11:49:49 -0400 writes:
    > Is it possible to replace c() with .subset()? 

It would be possible, but I think "entirely" wrong.

.subset() is documented to be an internal function not to be
used "lightly" and more to the point it is documented to *NOT*
dispatch at all.

If you read and understood what Peter and Luke wrote, you'd not
special case here:

diag() should not work only for pure matrices, but for all
"matrix-like" objects for which ``the usual methods'' work,
such
as
   as.vector(.), c(.)

That's why there has been the c(.) in there.

You can always make code faster if you write the code so it only
has to work in one special case .. and work there very fast.

    > Example below
    > ####
    > ####

    > library(microbenchmark)

Henrik Bengtsson

2015-May-13 03:24 UTC

head link

[Rd] Why is the diag function so slow (for extraction)?

Along Luke's lines, would(n't) it be enough to look for existence of
attribute 'class' to decide whether to dispatch or not, i.e. if c() is
needed or not?  Even without .subset(), there is a remarkable
improvement.  I think it's worth condition the code on dispatch or
not.  For example:

[HB-X201]{hb}: svn diff diag.R
Index: diag.R
==================================================================--- diag.R    
(revision 68345)
+++ diag.R      (working copy)
@@ -23,9 +23,11 @@
             stop("'nrow' or 'ncol' cannot be specified
when 'x' is a matrix")

         if((m <- min(dim(x))) == 0L) return(vector(typeof(x), 0L))
+        nms <- dimnames(x)
+        nrow <- dim(x)[1L]
         ## NB: need double index to avoid overflows.
-        y <- c(x)[1 + 0L:(m - 1L) * (dim(x)[1L] + 1)]
-        nms <- dimnames(x)
+        if (!is.null(attr(x, "class"))) x <- c(x)
+        y <- x[1 + 0L:(m - 1L) * (nrow + 1)]
         if (is.list(nms) && !any(sapply(nms, is.null)) &&
             identical((nm <- nms[[1L]][seq_len(m)]), nms[[2L]][seq_len(m)]))
             names(y) <- nm

?

/Henrik

On Tue, May 12, 2015 at 5:33 AM, Martin Maechler
<maechler at lynne.stat.math.ethz.ch> wrote:>>>>>> Steve Bronder <sbronder at stevebronder.com>
>>>>>>     on Thu, 7 May 2015 11:49:49 -0400 writes:
>
>     > Is it possible to replace c() with .subset()?
>
> It would be possible, but I think "entirely" wrong.
>
> .subset() is documented to be an internal function not to be
> used "lightly" and more to the point it is documented to *NOT*
> dispatch at all.
>
> If you read and understood what Peter and Luke wrote, you'd not
> special case here:
>
> diag() should not work only for pure matrices, but for all
> "matrix-like" objects for which ``the usual methods''
work, such
> as
>    as.vector(.), c(.)
>
> That's why there has been the c(.) in there.
>
> You can always make code faster if you write the code so it only
> has to work in one special case .. and work there very fast.
>
>
>     > Example below
>     > ####
>     > ####
>
>     > library(microbenchmark)
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

Henrik Bengtsson

2015-May-13 16:20 UTC

head link

[Rd] Why is the diag function so slow (for extraction)?

As kindly pointed out to me (oh my decaying gray matter), is.object()
is better suited for this test;

$ svn diff src/library/base/R/diag.R
Index: src/library/base/R/diag.R
==================================================================---
src/library/base/R/diag.R   (revision 68345)
+++ src/library/base/R/diag.R   (working copy)
@@ -23,9 +23,11 @@
             stop("'nrow' or 'ncol' cannot be specified
when 'x' is a matrix")

         if((m <- min(dim(x))) == 0L) return(vector(typeof(x), 0L))
+        nms <- dimnames(x)
+        nrow <- dim(x)[1L]
         ## NB: need double index to avoid overflows.
-        y <- c(x)[1 + 0L:(m - 1L) * (dim(x)[1L] + 1)]
-        nms <- dimnames(x)
+        if (is.object(x)) x <- c(x)

/Henrik

On Tue, May 12, 2015 at 8:24 PM, Henrik Bengtsson
<henrik.bengtsson at ucsf.edu> wrote:> Along Luke's lines, would(n't) it be enough to look for existence
of
> attribute 'class' to decide whether to dispatch or not, i.e. if c()
is
> needed or not?  Even without .subset(), there is a remarkable
> improvement.  I think it's worth condition the code on dispatch or
> not.  For example:
>
> [HB-X201]{hb}: svn diff diag.R
> Index: diag.R
> ==================================================================> ---
diag.R      (revision 68345)
> +++ diag.R      (working copy)
> @@ -23,9 +23,11 @@
>              stop("'nrow' or 'ncol' cannot be
specified when 'x' is a matrix")
>
>          if((m <- min(dim(x))) == 0L) return(vector(typeof(x), 0L))
> +        nms <- dimnames(x)
> +        nrow <- dim(x)[1L]
>          ## NB: need double index to avoid overflows.
> -        y <- c(x)[1 + 0L:(m - 1L) * (dim(x)[1L] + 1)]
> -        nms <- dimnames(x)
> +        if (!is.null(attr(x, "class"))) x <- c(x)
> +        y <- x[1 + 0L:(m - 1L) * (nrow + 1)]
>          if (is.list(nms) && !any(sapply(nms, is.null)) &&
>              identical((nm <- nms[[1L]][seq_len(m)]),
nms[[2L]][seq_len(m)]))
>              names(y) <- nm
>
> ?
>
> /Henrik
>
> On Tue, May 12, 2015 at 5:33 AM, Martin Maechler
> <maechler at lynne.stat.math.ethz.ch> wrote:
>>>>>>> Steve Bronder <sbronder at stevebronder.com>
>>>>>>>     on Thu, 7 May 2015 11:49:49 -0400 writes:
>>
>>     > Is it possible to replace c() with .subset()?
>>
>> It would be possible, but I think "entirely" wrong.
>>
>> .subset() is documented to be an internal function not to be
>> used "lightly" and more to the point it is documented to
*NOT*
>> dispatch at all.
>>
>> If you read and understood what Peter and Luke wrote, you'd not
>> special case here:
>>
>> diag() should not work only for pure matrices, but for all
>> "matrix-like" objects for which ``the usual methods''
work, such
>> as
>>    as.vector(.), c(.)
>>
>> That's why there has been the c(.) in there.
>>
>> You can always make code faster if you write the code so it only
>> has to work in one special case .. and work there very fast.
>>
>>
>>     > Example below
>>     > ####
>>     > ####
>>
>>     > library(microbenchmark)
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

Reasonably Related Threads

Search for more possibly parallel threads

R devel - May 2015 - Why is the diag function so slow (for extraction)?

[Rd] Why is the diag function so slow (for extraction)?

[Rd] Why is the diag function so slow (for extraction)?

[Rd] Why is the diag function so slow (for extraction)?

Reasonably Related Threads