It's a good suggestion. Multiplication in this case is over 7 columns in the data, but the number of rows is millions. Unfortunately, the values are negative, as these are actually Gauss quadrature nodes used to evaluate a multidimensional integral.

colSums() is better than something like apply(dat, 2, sum); I was hoping there was something similar to colSums()/rowSums() using prod().

On 11/8/16, 3:00 PM, "Fox, John" <jfox at mcmaster.ca> wrote:

> Dear Harold,
>
> If the actual data with which you're dealing are non-negative, you could
> log all the values, and use colSums() on the logs. That might also have
> the advantage of greater numerical accuracy than multiplying millions of
> numbers. Depending on the numbers, the products may be too large or small
> to be represented. Of course, logs won't work with your toy example,
> where rnorm() will generate values that are both negative and positive.
>
> I hope this helps,
>  John
> -----------------------------
> John Fox, Professor
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> web: socserv.mcmaster.ca/jfox
>
> ________________________________________
> From: R-help [r-help-bounces at r-project.org] on behalf of Doran, Harold [HDoran at air.org]
> Sent: November 8, 2016 10:57 AM
> To: r-help at r-project.org
> Subject: [R] Alternative to apply in base R
>
> Without reaching out to another package in R, I wonder what the best way
> is to speed enhance the following toy example? Over the years I have
> become very comfortable with the family of apply functions and generally
> not good at finding an improvement for speed.
>
> This toy example is small, but my real data has many millions of rows and
> the same operations is repeated many times and so finding a less
> expensive alternative would be helpful.
>
> mm <- matrix(rnorm(100), ncol = 10)
> rn <- apply(mm, 1, prod)
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
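The log suggestion above can be extended to negative values by tracking the sign of each row product separately and taking logs of the absolute values. The following is only a sketch of that idea, not something proposed in the thread: the helper name row_prod_log is hypothetical, and it assumes the matrix has no NA values.

```r
# Hypothetical helper: row products via logs, with signs handled separately.
# Assumes m is a numeric matrix without NA values.
row_prod_log <- function(m) {
  # Number of negative entries in each row decides the sign of the product.
  n_neg <- rowSums(m < 0)
  sgn <- ifelse(n_neg %% 2 == 0, 1, -1)
  # Sum the logs of the absolute values, then transform back.
  # A zero entry gives log(0) = -Inf and exp(-Inf) = 0, so the product is 0.
  sgn * exp(rowSums(log(abs(m))))
}

set.seed(1)
mm <- matrix(rnorm(100), ncol = 10)

# Compare against the original apply() version.
all.equal(row_prod_log(mm), apply(mm, 1, prod))
```

Results agree with apply(mm, 1, prod) only up to floating-point tolerance, which is why the comparison uses all.equal() rather than identical().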
Log-sum-antilog is faster than apply by several times, but vector multiplication in a for loop as David and Chuck have suggested is several times faster than that.

-- 
Sent from my phone. Please excuse my brevity.
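The messages from David and Chuck are not part of this excerpt, so the following is only a guess at what "vector multiplication in a for loop" looks like: loop over the (few) columns and multiply whole column vectors, so that the work per iteration is vectorized over the millions of rows. The function name row_prod_loop is hypothetical.

```r
# Hypothetical helper: accumulate the row products one column at a time.
row_prod_loop <- function(m) {
  out <- rep(1, nrow(m))          # running product for every row
  for (j in seq_len(ncol(m))) {
    out <- out * m[, j]           # one vectorized multiply per column
  }
  out
}

set.seed(1)
mm <- matrix(rnorm(100), ncol = 10)

# Same result as the original apply() call, up to floating-point tolerance.
all.equal(row_prod_loop(mm), apply(mm, 1, prod))
```

The loop runs ncol(m) times rather than nrow(m) times, which is the opposite of what apply(mm, 1, prod) does internally, and it works for negative values without any special handling.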
> On 08 Nov 2016, at 21:23 , Doran, Harold <HDoran at air.org> wrote:
>
> It's a good suggestion. Multiplication in this case is over 7 columns in
> the data, but the number of rows is millions. Unfortunately, the values
> are negative as these are actually gauss-quad nodes used to evaluate a
> multidimensional integral.

If there really are only 7 cols, then there's also the blindingly obvious

mm[,1]*mm[,2]*mm[,3]*mm[,4]*mm[,5]*mm[,6]*mm[,7]

-pd

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
Well, I wish R-help had a "like" button as I would most certainly like this reply :)

As usual, you're right. I should have added a disclaimer that "in this instance" there are 7 columns, as the function I wrote evaluates an N-dimensional integral and so as the dimensions change, so does the number of columns in this matrix (plus another factor). But the number of columns is never all that large.

On 11/8/16, 4:37 PM, "peter dalgaard" <pdalgd at gmail.com> wrote:

> If there really are only 7 cols, then there's also the blindingly obvious
>
> mm[,1]*mm[,2]*mm[,3]*mm[,4]*mm[,5]*mm[,6]*mm[,7]
>
> -pd
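When the number of columns changes with the dimension of the integral, the explicit product can be written without naming each column. One possible sketch, using base R's Reduce(); the helper name row_prod_reduce is hypothetical and was not proposed in the thread:

```r
# Hypothetical helper: elementwise product of all columns of m,
# for any number of columns.
row_prod_reduce <- function(m) {
  # Split the matrix into a list of column vectors.
  cols <- lapply(seq_len(ncol(m)), function(j) m[, j])
  # Multiply them together elementwise; the init value of 1s also
  # covers the edge case of a matrix with zero columns.
  Reduce(`*`, cols, rep(1, nrow(m)))
}

set.seed(1)
mm <- matrix(rnorm(70), ncol = 7)

# Matches the written-out product for the 7-column case.
all.equal(
  row_prod_reduce(mm),
  mm[, 1] * mm[, 2] * mm[, 3] * mm[, 4] * mm[, 5] * mm[, 6] * mm[, 7]
)
```

Whether this is faster or slower than a plain for loop over the columns would need to be timed on the real data; it is mainly a way to keep the code independent of the column count.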