thr3ads.net - R help - [R] A slightly unorthodox matrix product. [Aug 2018]

If this information is useful, please help other people find it:
Share via:

Berry, Charles

2018-Aug-04 19:31 UTC

[R] A slightly unorthodox matrix product.

> On Aug 4, 2018, at 11:43 AM, Jeff Newmiller <jdnewmil at
dcn.davis.ca.us> wrote:
> 
> Sometimes a good old for loop performs best, even if it doesn't look
sexy:
> 
> 
Fair enough, but a vectorized solution beats them all (see below).

Also,

[SNIP]

> # Charles
> ans1b <- function( a, b )
> {
The lapply you put here was from Eric's solution:
>  xxx <- lapply( seq.int( nrow( A ) )
>               , function( i ) {
>                    A[ i, ] %o% B[ i, ]
>                 }

This is what I had in mind:

ans1b.corrected <- function( a, b ) {
  yyy <- sapply( seq.int( nrow( a ) )
                 , function( i ) a[ i, ] %o% b[ i, ]
                 , simplify = "array"
  )
  zzz <- aperm( yyy, c( 3, 1, 2 ) )
  zzz
}

On my system it is slower than a for loop but a lot faster than your benchmark
showed with the superfluous code from Eric's solution.

For speed, a vectorized solution is faster than a for loop by a factor of 3 on
my laptop:

ans0 <- function(A,B){
  nca <- ncol(A)
  ncb <- ncol(B)
  j.index <- rep(1:nca, times = ncb)
  k.index <- rep(1:nca, each = ncb)
  res <- array(A[, j.index] * B[, k.index], c(nrow(A), nca, ncb))
  res
  }

> microbenchmark(+   res0 <- ans0(A, B),
+   res1 <- ans1(A, B),
+   res1b <- ans1b.corrected(A, B),
+   res2 <- ans2(A, B),
+   res3 <- ans3(A, B)
+ )
Unit: microseconds
                           expr     min       lq      mean   median       uq    
max neval   cld
             res0 <- ans0(A, B)  13.281  18.4960  21.52723  19.9905  23.4750 
61.556   100 a
             res1 <- ans1(A, B) 353.121 369.8635 409.77788 381.5840 444.3290
701.256   100     e
 res1b <- ans1b.corrected(A, B)  82.816  89.4185 101.52321  95.4275 107.1700
217.357   100   c
             res2 <- ans2(A, B)  49.674  54.4825  61.78278  58.7540  65.5265
172.625   100  b
             res3 <- ans3(A, B) 317.772 342.4220 392.25065 360.4675 436.2125
602.346   100    d > 

FWIW, if there are dimnames on A and B, sapply( row names(A), ...,
simplify="array") preserves them without further ado.

Chuck

Jeff Newmiller

2018-Aug-04 19:59 UTC

head link

[R] A slightly unorthodox matrix product.

Sorry I missed your intent on the sapply.

Slick work on the vectorizing, but for the future reference it was 
slightly buggy:

#######
A <- matrix( 1:12, nrow = 3 )
B <- matrix( 1:15, nrow = 3 )

# for loop
ans2 <- function( a, b ) {
   zzz <- array( rep( NA, nrow( a ) * ncol( a ) * ncol( b ) )
               , dim = c( nrow( a ), ncol( a ), ncol( b ) )
               )
   jseq <- seq.int( ncol( a ) )
   kseq <- seq.int( ncol( b ) )
   for ( i in seq.int( nrow( a ) ) ) {
     zzz[ i, jseq, kseq ] <- outer( a[ i, ], b[ i, ] )
   }
   zzz
}

# fast but buggy
ans0 <- function( A, B ) {
   nca <- ncol( A )
   ncb <- ncol( B )
   j.index <- rep( seq.int( nca ), times = ncb)
   k.index <- rep( seq.int( nca ), each = ncb)
   res <- array( A[ , j.index ] * B[ , k.index ]
               , c( nrow( A ), nca, ncb )
               )
   res
   }

# bugfixed
ans0b <- function( A, B ) {
   nca <- ncol( A )
   ncb <- ncol( B )
   j.index <- rep( seq.int( nca ), times = ncb )
   k.index <- rep( seq.int( ncb ), each = nca )
   res <- array( A[ , j.index ] * B[ , k.index ]
               , c( nrow( A ), nca, ncb )
               )
   res
   }

library(microbenchmark)

microbenchmark( res2 <- ans2( A, B )
               , res0b <- ans0b( A, B )
               , res0 <- ans0( A, B )
               )
#> Unit: microseconds
#>                  expr    min      lq     mean  median      uq      max
#>    res2 <- ans2(A, B) 84.987 87.8185 270.2153 96.4315 99.4175 17531.77
#>  res0b <- ans0b(A, B) 17.940 19.2055 126.8974 20.8800 22.2865 10616.36
#>    res0 <- ans0(A, B) 18.041 19.1670 126.1183 20.5530 21.9545 10532.44
#>  neval
#>    100
#>    100
#>    100
all( res2 == res0 )
#> [1] FALSE
all( res2 == res0b )
#> [1] TRUE

#' Created on 2018-08-04 by the [reprex
package](http://reprex.tidyverse.org) (v0.2.0).
#######

On Sat, 4 Aug 2018, Berry, Charles wrote:
>
>
>> On Aug 4, 2018, at 11:43 AM, Jeff Newmiller <jdnewmil at
dcn.davis.ca.us> wrote:
>>
>> Sometimes a good old for loop performs best, even if it doesn't
look sexy:
>>
>>
>
> Fair enough, but a vectorized solution beats them all (see below).
>
> Also,
>
> [SNIP]
>
>
>> # Charles
>> ans1b <- function( a, b )
>> {
>
> The lapply you put here was from Eric's solution:
>
>>  xxx <- lapply( seq.int( nrow( A ) )
>>               , function( i ) {
>>                    A[ i, ] %o% B[ i, ]
>>                 }
>
>
> This is what I had in mind:
>
> ans1b.corrected <- function( a, b ) {
>  yyy <- sapply( seq.int( nrow( a ) )
>                 , function( i ) a[ i, ] %o% b[ i, ]
>                 , simplify = "array"
>  )
>  zzz <- aperm( yyy, c( 3, 1, 2 ) )
>  zzz
> }
>
> On my system it is slower than a for loop but a lot faster than your
benchmark showed with the superfluous code from Eric's solution.
>
> For speed, a vectorized solution is faster than a for loop by a factor of 3
on my laptop:
>
> ans0 <- function(A,B){
>  nca <- ncol(A)
>  ncb <- ncol(B)
>  j.index <- rep(1:nca, times = ncb)
>  k.index <- rep(1:nca, each = ncb)
>  res <- array(A[, j.index] * B[, k.index], c(nrow(A), nca, ncb))
>  res
>  }
>
>
>> microbenchmark(
> +   res0 <- ans0(A, B),
> +   res1 <- ans1(A, B),
> +   res1b <- ans1b.corrected(A, B),
> +   res2 <- ans2(A, B),
> +   res3 <- ans3(A, B)
> + )
> Unit: microseconds
>                           expr     min       lq      mean   median       uq
max neval   cld
>             res0 <- ans0(A, B)  13.281  18.4960  21.52723  19.9905 
23.4750  61.556   100 a
>             res1 <- ans1(A, B) 353.121 369.8635 409.77788 381.5840
444.3290 701.256   100     e
> res1b <- ans1b.corrected(A, B)  82.816  89.4185 101.52321  95.4275
107.1700 217.357   100   c
>             res2 <- ans2(A, B)  49.674  54.4825  61.78278  58.7540 
65.5265 172.625   100  b
>             res3 <- ans3(A, B) 317.772 342.4220 392.25065 360.4675
436.2125 602.346   100    d
>>
>
>
> FWIW, if there are dimnames on A and B, sapply( row names(A), ...,
simplify="array") preserves them without further ado.
>
> Chuck
>
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live
Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k

Berry, Charles

2018-Aug-04 19:59 UTC

head link

[R] A slightly unorthodox matrix product.

> On Aug 4, 2018, at 12:59 PM, Jeff Newmiller <jdnewmil at
dcn.davis.ca.us> wrote:
> ...
> Slick work on the vectorizing, but for the future reference it was slightly
buggy:
> 
Thanks for catching that!

Chuck

R help - Aug 2018 - A slightly unorthodox matrix product.

[R] A slightly unorthodox matrix product.

[R] A slightly unorthodox matrix product.

[R] A slightly unorthodox matrix product.