I spent a lot of time searching and came up empty handed on the following query. Is there an equivalent to rowSums that does product or cumulative product and avoids use of apply or looping? I found a rowProd in a package but it was a convenience function for apply. As part of a likelihood calculation called from optim, I?m computing products and cumulative products of rows of matrices with far more rows than columns. I started with apply and after some thought realized that a loop of columns might be faster and it was substantially faster (see below). Because the likelihood function is called many times I?d like to speed it up even more if possible. Below is an example showing the cumprod.matrix and prod.matrix looping functions that I wrote and some timing comparisons to the use of apply for different column and row dimensions. At this point I?m better off with looping but I?d like to hear of any further suggestions. Thanks ?jeff > prod.matrix=function(x) + { + y=x[,1] + for(i in 2:dim(x)[2]) + y=y*x[,i] + return(y) + } > cumprod.matrix=function(x) + { + y=matrix(1,nrow=dim(x)[1],ncol=dim(x)[2]) + y[,1]=x[,1] + for (i in 2:dim(x)[2]) + y[,i]=y[,i-1]*x[,i] + return(y) + } > N=10000000 > xmat=matrix(runif(N),ncol=10) > system.time(cumprod.matrix(xmat)) user system elapsed 1.07 0.09 1.15 > system.time(t(apply(xmat,1,cumprod))) user system elapsed 29.27 0.21 29.50 > system.time(prod.matrix(xmat)) user system elapsed 0.29 0.00 0.30 > system.time(apply(xmat,1,prod)) user system elapsed 30.69 0.00 30.72 > xmat=matrix(runif(N),ncol=100) > system.time(cumprod.matrix(xmat)) user system elapsed 1.05 0.13 1.18 > system.time(t(apply(xmat,1,cumprod))) user system elapsed 3.55 0.14 3.70 > system.time(prod.matrix(xmat)) user system elapsed 0.38 0.01 0.39 > system.time(apply(xmat,1,prod)) user system elapsed 2.87 0.00 2.89 > xmat=matrix(runif(N),ncol=1000) > system.time(cumprod.matrix(xmat)) user system elapsed 1.30 0.18 1.46 > system.time(t(apply(xmat,1,cumprod))) user system elapsed 1.77 0.27 2.05 > system.time(prod.matrix(xmat)) user system elapsed 0.46 0.00 0.47 > system.time(apply(xmat,1,prod)) user system elapsed 0.7 0.0 0.7 > xmat=matrix(runif(N),ncol=10000) > system.time(cumprod.matrix(xmat)) user system elapsed 1.28 0.00 1.29 > system.time(t(apply(xmat,1,cumprod))) user system elapsed 1.19 0.08 1.26 > system.time(prod.matrix(xmat)) user system elapsed 0.40 0.00 0.41 > system.time(apply(xmat,1,prod)) user system elapsed 0.57 0.00 0.56 > xmat=matrix(runif(N),ncol=100000) > system.time(cumprod.matrix(xmat)) user system elapsed 3.18 0.00 3.19 > system.time(t(apply(xmat,1,cumprod))) user system elapsed 1.42 0.21 1.64 > system.time(prod.matrix(xmat)) user system elapsed 1.05 0.00 1.05 > system.time(apply(xmat,1,prod)) user system elapsed 0.82 0.00 0.81 > R.Version() $platform [1] "i386-pc-mingw32" . . . $version.string [1] "R version 2.7.1 (2008-06-23)"
On Sun, 17 Aug 2008, Jeff Laake wrote:> I spent a lot of time searching and came up empty handed on the following > query. Is there an equivalent to rowSums that does product or cumulative > product and avoids use of apply or looping? I found a rowProd in a package > but it was a convenience function for apply. As part of a likelihood > calculation called from optim, I?m computing products and cumulative products > of rows of matrices with far more rows than columns. I started with apply and > after some thought realized that a loop of columns might be faster and it was > substantially faster (see below). Because the likelihood function is called > many times I?d like to speed it up even more if possible.You might check out the 'inline' or 'jit' packages. Otherwise, if you can as easily treat xmat as a list (or data.frame), Reduce( "*", xmat.data.frame, accumulate=want.cumprod ) (where want.cumprod is FALSE for product, TRUE for cumulative product) will be a bit faster in many circumstances. However, this advantage is lost if you must retain xmat as a matrix since converting it to a data.frame seems to require more time than you save. HTH, Chuck> > Below is an example showing the cumprod.matrix and prod.matrix looping > functions that I wrote and some timing comparisons to the use of apply for > different column and row dimensions. At this point I?m better off with > looping but I?d like to hear of any further suggestions. > > Thanks ?jeff > >> prod.matrix=function(x) > + { > + y=x[,1] > + for(i in 2:dim(x)[2]) > + y=y*x[,i] > + return(y) > + } > >> cumprod.matrix=function(x) > + { > + y=matrix(1,nrow=dim(x)[1],ncol=dim(x)[2]) > + y[,1]=x[,1] > + for (i in 2:dim(x)[2]) > + y[,i]=y[,i-1]*x[,i] > + return(y) > + } > >> N=10000000 >> xmat=matrix(runif(N),ncol=10) >> system.time(cumprod.matrix(xmat)) > user system elapsed > 1.07 0.09 1.15 >> system.time(t(apply(xmat,1,cumprod))) > user system elapsed > 29.27 0.21 29.50 >> system.time(prod.matrix(xmat)) > user system elapsed > 0.29 0.00 0.30 >> system.time(apply(xmat,1,prod)) > user system elapsed > 30.69 0.00 30.72 >> xmat=matrix(runif(N),ncol=100) >> system.time(cumprod.matrix(xmat)) > user system elapsed > 1.05 0.13 1.18 >> system.time(t(apply(xmat,1,cumprod))) > user system elapsed > 3.55 0.14 3.70 >> system.time(prod.matrix(xmat)) > user system elapsed > 0.38 0.01 0.39 >> system.time(apply(xmat,1,prod)) > user system elapsed > 2.87 0.00 2.89 >> xmat=matrix(runif(N),ncol=1000) >> system.time(cumprod.matrix(xmat)) > user system elapsed > 1.30 0.18 1.46 >> system.time(t(apply(xmat,1,cumprod))) > user system elapsed > 1.77 0.27 2.05 >> system.time(prod.matrix(xmat)) > user system elapsed > 0.46 0.00 0.47 >> system.time(apply(xmat,1,prod)) > user system elapsed > 0.7 0.0 0.7 >> xmat=matrix(runif(N),ncol=10000) >> system.time(cumprod.matrix(xmat)) > user system elapsed > 1.28 0.00 1.29 >> system.time(t(apply(xmat,1,cumprod))) > user system elapsed > 1.19 0.08 1.26 >> system.time(prod.matrix(xmat)) > user system elapsed > 0.40 0.00 0.41 >> system.time(apply(xmat,1,prod)) > user system elapsed > 0.57 0.00 0.56 >> xmat=matrix(runif(N),ncol=100000) >> system.time(cumprod.matrix(xmat)) > user system elapsed > 3.18 0.00 3.19 >> system.time(t(apply(xmat,1,cumprod))) > user system elapsed > 1.42 0.21 1.64 >> system.time(prod.matrix(xmat)) > user system elapsed > 1.05 0.00 1.05 >> system.time(apply(xmat,1,prod)) > user system elapsed > 0.82 0.00 0.81 >> R.Version() > $platform > [1] "i386-pc-mingw32" > . > . > . > $version.string > [1] "R version 2.7.1 (2008-06-23)" > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:cberry at tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
Hi Jeff, If I understand correctly, the overhead of a loop is that at each iteration the command must be interpreted, and this time is independent of the number of rows N. So if N is small this overhead may be very significant but when N is large this should be very small compared to the time needed to multiply N pairs of numbers. --- On Mon, 18/8/08, Jeff Laake <Jeff.Laake at noaa.gov> wrote:> From: Jeff Laake <Jeff.Laake at noaa.gov> > Subject: [R] matrix row product and cumulative product > To: r-help at r-project.org > Received: Monday, 18 August, 2008, 12:49 PM > I spent a lot of time searching and came up empty handed on > the > following query. Is there an equivalent to rowSums that > does product or > cumulative product and avoids use of apply or looping? I > found a rowProd > in a package but it was a convenience function for apply. > As part of a > likelihood calculation called from optim, I?m computing > products and > cumulative products of rows of matrices with far more rows > than columns. > I started with apply and after some thought realized that a > loop of > columns might be faster and it was substantially faster > (see below). > Because the likelihood function is called many times I?d > like to speed > it up even more if possible. > > Below is an example showing the cumprod.matrix and > prod.matrix looping > functions that I wrote and some timing comparisons to the > use of apply > for different column and row dimensions. At this point > I?m better off > with looping but I?d like to hear of any further > suggestions. > > Thanks ?jeff > > > prod.matrix=function(x) > + { > + y=x[,1] > + for(i in 2:dim(x)[2]) > + y=y*x[,i] > + return(y) > + } > > > cumprod.matrix=function(x) > + { > + y=matrix(1,nrow=dim(x)[1],ncol=dim(x)[2]) > + y[,1]=x[,1] > + for (i in 2:dim(x)[2]) > + y[,i]=y[,i-1]*x[,i] > + return(y) > + } > > > N=10000000 > > xmat=matrix(runif(N),ncol=10) > > system.time(cumprod.matrix(xmat)) > user system elapsed > 1.07 0.09 1.15 > > system.time(t(apply(xmat,1,cumprod))) > user system elapsed > 29.27 0.21 29.50 > > system.time(prod.matrix(xmat)) > user system elapsed > 0.29 0.00 0.30 > > system.time(apply(xmat,1,prod)) > user system elapsed > 30.69 0.00 30.72 > > xmat=matrix(runif(N),ncol=100) > > system.time(cumprod.matrix(xmat)) > user system elapsed > 1.05 0.13 1.18 > > system.time(t(apply(xmat,1,cumprod))) > user system elapsed > 3.55 0.14 3.70 > > system.time(prod.matrix(xmat)) > user system elapsed > 0.38 0.01 0.39 > > system.time(apply(xmat,1,prod)) > user system elapsed > 2.87 0.00 2.89 > > xmat=matrix(runif(N),ncol=1000) > > system.time(cumprod.matrix(xmat)) > user system elapsed > 1.30 0.18 1.46 > > system.time(t(apply(xmat,1,cumprod))) > user system elapsed > 1.77 0.27 2.05 > > system.time(prod.matrix(xmat)) > user system elapsed > 0.46 0.00 0.47 > > system.time(apply(xmat,1,prod)) > user system elapsed > 0.7 0.0 0.7 > > xmat=matrix(runif(N),ncol=10000) > > system.time(cumprod.matrix(xmat)) > user system elapsed > 1.28 0.00 1.29 > > system.time(t(apply(xmat,1,cumprod))) > user system elapsed > 1.19 0.08 1.26 > > system.time(prod.matrix(xmat)) > user system elapsed > 0.40 0.00 0.41 > > system.time(apply(xmat,1,prod)) > user system elapsed > 0.57 0.00 0.56 > > xmat=matrix(runif(N),ncol=100000) > > system.time(cumprod.matrix(xmat)) > user system elapsed > 3.18 0.00 3.19 > > system.time(t(apply(xmat,1,cumprod))) > user system elapsed > 1.42 0.21 1.64 > > system.time(prod.matrix(xmat)) > user system elapsed > 1.05 0.00 1.05 > > system.time(apply(xmat,1,prod)) > user system elapsed > 0.82 0.00 0.81 > > R.Version() > $platform > [1] "i386-pc-mingw32" > . > . > . > $version.string > [1] "R version 2.7.1 (2008-06-23)" > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, > reproducible code.