Sarah wins on memory use.

Rui wins on speed.

Bert wins on clarity.

library(microbenchmark)

N <- 1000
x <- matrix( runif( N*N ), ncol=N )
y <- seq.int( N )

microbenchmark( { t( y * t(x) ) }
              , { x %*% diag( y ) }
              , { sweep( x, 2, y, `*` ) }
              )
Unit: milliseconds
                    expr       min        lq    median        uq      max neval
         { t(y * t(x)) }  6.659562  7.475414  7.871341  8.182623 47.01105   100
       { x %*% diag(y) }  9.859292 11.014021 11.281334 11.733825 48.79463   100
 { sweep(x, 2, y, `*`) } 16.535938 17.682175 18.283572 18.712342 55.47159   100

On Fri, 4 Nov 2016, Dimitri Liakhovitski wrote:

> Nice!
> Thanks a lot, everybody!
> Dimitri
>
> On Fri, Nov 4, 2016 at 10:35 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>> My goodness!
>>
>> > x %*% diag(y)
>>      [,1] [,2]
>> [1,]    2   12
>> [2,]    4   15
>> [3,]    6   18
>>
>> will do.
>>
>> -- Bert
>>
>> On Thu, Nov 3, 2016 at 2:33 PM, Sarah Goslee <sarah.goslee at gmail.com> wrote:
>>> Like this?
>>>
>>> > sweep(x, 2, y, "*")
>>>      [,1] [,2]
>>> [1,]    2   12
>>> [2,]    4   15
>>> [3,]    6   18
>>>
>>> On Thu, Nov 3, 2016 at 5:05 PM, Dimitri Liakhovitski
>>> <dimitri.liakhovitski at gmail.com> wrote:
>>>> Hello!
>>>>
>>>> I have a matrix x and a vector y:
>>>>
>>>> x <- matrix(1:6, ncol = 2)
>>>> y <- c(2,3)
>>>>
>>>> I need to multiply the first column of x by 2 (y[1]) and the second
>>>> column of x by 3 (y[2]).
>>>>
>>>> Of course, I could do this - but it's column by column:
>>>>
>>>> x[,1] <- x[,1] * y[1]
>>>> x[,2] <- x[,2] * y[2]
>>>> x
>>>>
>>>> Or I could repeat each element of y and multiply two matrices - that's better:
>>>>
>>>> rep.row <- function(x, n){
>>>>   matrix(rep(x, each=n), nrow=n)
>>>> }
>>>> y <- rep.row(y, nrow(x))
>>>> x * y
>>>>
>>>> However, maybe there is a more elegant R-like way of doing it?
>>>> Thank you!
>>>>
>>>> -- Dimitri Liakhovitski
>>>
>>> -- Sarah Goslee
>>> http://www.functionaldiversity.org

---------------------------------------------------------------------------
Jeff Newmiller                        DCN:<jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
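A quick sanity check, not part of the original messages: a minimal base-R sketch on Dimitri's small example showing that the three one-liners benchmarked above produce the same matrix.

    x <- matrix(1:6, ncol = 2)
    y <- c(2, 3)

    # column-wise scaling, three ways
    m1 <- t( y * t(x) )          # recycle y down the rows of t(x), then transpose back
    m2 <- x %*% diag(y)          # matrix product with a diagonal matrix
    m3 <- sweep(x, 2, y, `*`)    # sweep y across MARGIN = 2 (the columns)

    all.equal(m1, m2)   # TRUE
    all.equal(m1, m3)   # TRUE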
Notice though, that Bert loses (or _should_ lose) for larger values of N, since that method involves O(N^3) operations whereas the other two are O(N^2). I am a bit surprised that sweep() is so inefficient even at N=1000.

-pd

On 04 Nov 2016, at 16:41, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:

> Sarah wins on memory use.
> Rui wins on speed.
> Bert wins on clarity.
>
> Unit: milliseconds
>                     expr       min        lq    median        uq      max neval
>          { t(y * t(x)) }  6.659562  7.475414  7.871341  8.182623 47.01105   100
>        { x %*% diag(y) }  9.859292 11.014021 11.281334 11.733825 48.79463   100
>  { sweep(x, 2, y, `*`) } 16.535938 17.682175 18.283572 18.712342 55.47159   100

--
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501    Office: A 4.23
Email: pd.mes at cbs.dk   Priv: PDalgd at gmail.com
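To make Peter's point concrete, here is a minimal sketch (not from the thread) that repeats the benchmark at a few sizes of N. The exact numbers depend heavily on the BLAS in use, but the x %*% diag(y) method should grow roughly like N^3 while the other two grow like N^2.

    library(microbenchmark)

    ## Illustrative only: time the three methods at increasing N.
    for (N in c(500, 1000, 2000)) {
      x <- matrix(runif(N * N), ncol = N)
      y <- seq_len(N)
      cat("N =", N, "\n")
      print(microbenchmark(
        t(y * t(x)),            # elementwise, O(N^2)
        x %*% diag(y),          # full matrix-matrix product, O(N^3)
        sweep(x, 2, y, `*`),    # elementwise, O(N^2)
        times = 5
      ))
    }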
> On 4 Nov 2016, at 16:41, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>
> Sarah wins on memory use. Rui wins on speed. Bert wins on clarity.
>
> Unit: milliseconds
>                     expr       min        lq    median        uq      max neval
>          { t(y * t(x)) }  6.659562  7.475414  7.871341  8.182623 47.01105   100
>        { x %*% diag(y) }  9.859292 11.014021 11.281334 11.733825 48.79463   100
>  { sweep(x, 2, y, `*`) } 16.535938 17.682175 18.283572 18.712342 55.47159   100

I get different results with R 3.2.2 on Mac OS X (using reference BLAS).

library(rbenchmark)

N <- 1000
x <- matrix( runif( N*N ), ncol=N )
y <- seq.int( N )

benchmark( t( y * t(x) ), x %*% diag( y ), sweep( x, 2, y, `*` ),
           columns=c("test","elapsed","relative"), replications=10 )

#                  test elapsed relative
# 3 sweep(x, 2, y, `*`)   0.132    1.000
# 1         t(y * t(x))   0.189    1.432
# 2       x %*% diag(y)   7.928   60.061

library(microbenchmark)
microbenchmark( { t( y * t(x) ) }
              , { x %*% diag( y ) }
              , { sweep( x, 2, y, `*` ) },
              times=10, unit="s" )

# Unit: seconds
#                     expr        min         lq       mean     median         uq        max neval cld
#          { t(y * t(x)) } 0.01099410 0.01132096 0.01597058 0.01332414 0.01430672 0.04447432    10  a
#        { x %*% diag(y) } 0.76588887 0.78581717 0.79878255 0.79811626 0.80607877 0.82903945    10   b
#  { sweep(x, 2, y, `*`) } 0.01177478 0.01246777 0.01409457 0.01314718 0.01600818 0.01802171    10  a

sweep appears to be very good.

I don't quite understand why I get a very different ranking.

Berend
> On 4 Nov 2016, at 19:27, Berend Hasselman <bhh at xs4all.nl> wrote:
>
> I get different results with R 3.2.2 on Mac OS X (using reference BLAS).

Correction: I meant R 3.3.2.

Berend
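Since the differing rankings in this thread most plausibly come from which linear-algebra libraries each R build is linked against, here is a minimal sketch (not from the thread) of base-R calls that help pin that down. Note that sessionInfo() reports the BLAS/LAPACK paths only in more recent R versions.

    sessionInfo()    # platform details; newer R versions also list the BLAS/LAPACK libraries
    La_version()     # version of the LAPACK linked into this build

    ## A crude speed probe: a dense matrix product is far faster with an
    ## optimized BLAS (vecLib, OpenBLAS, MKL) than with the reference BLAS.
    N <- 1000
    a <- matrix(runif(N * N), N)
    system.time(a %*% a)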
Berend, et al.:

"I don't quite understand why I get a very different ranking."

Nor do I. But, as Peter noted, my matrix-multiplication solution "should" be worst in terms of efficiency, and it clearly is. The other two *should* be "similar", and they are. So your results are consistent with what one should expect, whereas, as Peter noted, Jeff's are not.

But as we all know, *actual* efficiency in use can be tricky to determine (asymptopia is always questionable), as it may depend on state, problem details, exact algorithms in use, etc. Which is why the standard first principle in code optimization is: don't.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Fri, Nov 4, 2016 at 11:27 AM, Berend Hasselman <bhh at xs4all.nl> wrote:

> I get different results with R 3.2.2 on Mac OS X (using reference BLAS).
>
> #                  test elapsed relative
> # 3 sweep(x, 2, y, `*`)   0.132    1.000
> # 1         t(y * t(x))   0.189    1.432
> # 2       x %*% diag(y)   7.928   60.061
>
> sweep appears to be very good.
>
> I don't quite understand why I get a very different ranking.
>
> Berend
> On 4 Nov 2016, at 17:35, peter dalgaard <pdalgd at gmail.com> wrote:
>
> Notice though, that Bert loses (or _should_ lose) for larger values of N, since that method involves O(N^3) operations whereas the other two are O(N^2). I am a bit surprised that sweep() is so inefficient even at N=1000.

I also got different results on Mac OS X (with R 3.3.1 and vecLib BLAS):

Unit: milliseconds
                    expr      min       lq     mean   median       uq       max neval
         { t(y * t(x)) } 18.46369 19.61954 25.77548 21.24242 22.72943  88.03469   100
       { x %*% diag(y) } 28.12786 29.87109 37.83814 31.59671 32.77839 101.68553   100
 { sweep(x, 2, y, `*`) } 11.19871 12.76442 21.48670 14.51618 15.70665 100.51444   100

Note that sweep() is considerably _faster_ than the other two methods (which I found quite surprising).

Of course, if you care about speed (and memory efficiency), you could also

    library(wordspace)
    scaleMargins(x, cols=y)

Best,
Stefan
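A follow-up sketch, not from the thread and assuming the CRAN package wordspace is installed: before adopting scaleMargins(), it is easy to confirm that it reproduces the base-R result. The cols= argument is used here exactly as in Stefan's message.

    ## Check that wordspace::scaleMargins() matches sweep() on a random matrix.
    if (requireNamespace("wordspace", quietly = TRUE)) {
      N <- 1000
      x <- matrix(runif(N * N), ncol = N)
      y <- seq_len(N)

      base_res <- sweep(x, 2, y, `*`)
      ws_res   <- wordspace::scaleMargins(x, cols = y)

      print(all.equal(base_res, ws_res))   # expected: TRUE
    }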