Dear all,

I am running a simulation with 200 replications. In each replication I generate a matrix and apply a function to it to get a 6-dimensional vector as my result.

Since I expected a loop to be slow, I first generate the 200 matrices, save them in a list named ma, and then define

zz <- sapply(ma, myfunction)

To my surprise, this takes almost the same time as running a loop from 1 to 200 directly. Is it common? Can I improve it any further?

PS: how can I measure exactly how long my code takes to run?

Thanks.

Zhen
Zhen,

> how to count the exact time

See ?system.time (package base): it returns the CPU (and other) times that expr used.

If you only need a resolution of seconds, you can also do

date(); zz <- sapply(ma, myfunction); date()

I do not know how to reduce the time. For very complex iterations I use for() myself, which may be inefficient.

Mayeul KAUFFMANN
Université Pierre Mendès France
Grenoble, France
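A minimal sketch of the two timing approaches mentioned above. The work() function is a hypothetical stand-in for sapply(ma, myfunction), since neither ma nor myfunction had been posted at this point in the thread.

```r
# Hypothetical workload standing in for sapply(ma, myfunction)
work <- function() sapply(1:200, function(i) sum(sort(runif(1e4))))

# system.time() evaluates the expression and reports user, system and
# elapsed times in seconds
timing <- system.time(zz <- work())
print(timing)

# Wall-clock alternative with sub-second resolution
t0 <- Sys.time()
zz <- work()
elapsed <- difftime(Sys.time(), t0, units = "secs")
print(elapsed)
```

system.time() is usually the more convenient of the two, because it separates CPU time from elapsed time.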
You can use system.time() to time your procedure. There's no guarantee that sapply() will be faster than a for() loop, especially if you preallocate the matrices.

-roger

--
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
Without seeing what myfunction is, it's almost impossible to tell.

In addition to system.time(), you might want to profile your code, e.g.,

Rprof()
zz <- sapply(ma, myfunction)
Rprof(NULL)
summaryRprof()

HTH,
Andy
Below is my code; myfunction is the function I mentioned in my last email.

p0 <- 0.2
rho0 <- 0.2
nl <- 200
simu <- 200
set.seed(135)
setwd("d:/r")
options(warn = 1)

# draw a size ns[i] in 1..19 for each of the nl units, with
# probabilities given by frequency
ns <- rep(1, nl)
configuration <- runif(nl)
frequency <- c(0.0046, 0.0057, 0.0099, 0.0139, 0.0147, 0.0148, 0.0225,
               0.0321, 0.0475, 0.0766, 0.1179, 0.1529, 0.1605, 0.1424,
               0.0975, 0.0542, 0.0207, 0.0086, 0.0030)
for (i in 1:nl) {
  if (configuration[i] <= frequency[1]) {
    ns[i] <- 1
  } else {
    for (j in 2:length(frequency)) {
      if (sum(frequency[1:(j - 1)]) < configuration[i] &
          configuration[i] <= sum(frequency[1:j])) {
        ns[i] <- j
      }
    }
  }
}
nu <- unique(ns)
k <- max(nu)

# simulate simu data sets; each row of a data set is (n, y, count)
a <- vector(simu, mode = "list")
for (si in 1:simu) {
  print("si")
  print(si)
  data <- c(0, 0, 0)
  for (iu in 1:length(nu)) {
    fr <- length(ns[ns == nu[iu]])
    y <- rep(0, fr)
    for (ju in 1:fr) {
      y[ju] <- rbinom(1, nu[iu],
                      rbeta(1, (1 - rho0) * p0 / rho0,
                            (1 - rho0) * (1 - p0) / rho0))
    }
    yu <- sort(unique(y))
    yy <- rep(0, length(yu))
    for (ku in 1:length(yu)) {
      yy[ku] <- length(y[y == yu[ku]])
    }
    ma <- cbind(rep(nu[iu], length(yu)), yu, yy)
    data <- rbind(data, ma)
  }
  data <- data[-1, ]
  a[[si]] <- data
}

# minimise the negative log-likelihood of one data set with optim()
myfunction <- function(data) {
  llb <- function(theta) {
    s <- apply(data, 1, function(data) {
      n <- data[1]; y <- data[2]; re <- data[3]
      p <- 1 / (1 + exp(-theta[1]))
      t <- exp(theta[2])
      s <- log(choose(n, y))
      r <- c(0:(n - 1))
      s <- s - sum(log(1 + r * t))
      if (n - y - 1 >= 0) {
        r <- c(0:(n - y - 1))
        s <- s + sum(log(1 - p + r * t))
      }
      if (y - 1 >= 0) {
        r <- c(0:(y - 1))
        s <- s + sum(log(p + r * t))
      }
      s * re
    })
    -sum(s)
  }
  est2 <- optim(c(log(p0 / (1 - p0)), log(rho0 / (1 - rho0))), llb,
                hessian = TRUE, control = list(maxit = 5000000))
  est2$par
}

zz <- sapply(a, myfunction)

If we move myfunction into the for (si in 1:simu) loop, the results are the same and no time is saved. Can you improve it a little?

Thanks.

Zhen
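One part of the posted code that can be tightened is the double loop that assigns ns. The sketch below is not from the thread: it swaps the loop for cumsum() and findInterval(), and it assumes every uniform draw is no larger than sum(frequency). This set-up step is probably not where most of the run time goes, so the gain in total time is likely to be small.

```r
set.seed(135)
nl <- 200
frequency <- c(0.0046, 0.0057, 0.0099, 0.0139, 0.0147, 0.0148, 0.0225,
               0.0321, 0.0475, 0.0766, 0.1179, 0.1529, 0.1605, 0.1424,
               0.0975, 0.0542, 0.0207, 0.0086, 0.0030)
configuration <- runif(nl)

# Loop version, as in the posted code
ns_loop <- rep(1, nl)
for (i in 1:nl) {
  if (configuration[i] <= frequency[1]) {
    ns_loop[i] <- 1
  } else {
    for (j in 2:length(frequency)) {
      if (sum(frequency[1:(j - 1)]) < configuration[i] &
          configuration[i] <= sum(frequency[1:j])) {
        ns_loop[i] <- j
      }
    }
  }
}

# Vectorised version (an assumption of this sketch, not the poster's code):
# with left.open = TRUE, findInterval() counts the cumulative breakpoints
# strictly below each draw; adding 1 gives the category index.
breaks <- cumsum(frequency)
ns_vec <- findInterval(configuration, breaks, left.open = TRUE) + 1

print(all(ns_loop == ns_vec))
```

The left.open argument of findInterval() needs a reasonably recent version of R.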
On Sat, 16 Oct 2004, Zhen Pang wrote:

> [...] Is it common? [...]

It is quite common for a loop to be as fast as sapply(). After all, sapply() still has to run `myfunction' 200 times, and this is what takes most of the time, so there isn't any obvious reason why sapply() should be much faster. sapply() and lapply() certainly can be faster than loops, but usually not by very much.

The surprising fact is that so many people *believe* the apply() functions are typically much faster than loops. It's probably a useful belief, since it encourages people to learn to use them (and they sometimes are faster).

You should try using Rprof() to find out which parts of your code are slow.

-thomas
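The comparison described above can be sketched as follows. The function slow_fun() and the list mats are hypothetical stand-ins for myfunction and the list of simulated matrices; the loop writes into a preallocated result matrix. When the function body dominates the run time, as here, the two timings would be expected to be close.

```r
# Hypothetical stand-in for an expensive per-matrix computation
slow_fun <- function(m) {
  optim(c(0, 0), function(th) sum((m - th[1])^2) + th[2]^2)$par
}

set.seed(1)
mats <- lapply(1:50, function(i) matrix(rnorm(30), 10, 3))

# sapply() version
t_sapply <- system.time(res_sapply <- sapply(mats, slow_fun))

# for() version with a preallocated result matrix
t_loop <- system.time({
  res_loop <- matrix(NA_real_, nrow = 2, ncol = length(mats))
  for (i in seq_along(mats)) res_loop[, i] <- slow_fun(mats[[i]])
})

cat("sapply elapsed:", t_sapply[["elapsed"]], "\n")
cat("loop elapsed:  ", t_loop[["elapsed"]], "\n")
print(all.equal(res_sapply, res_loop))
```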
I tried to use Rprof(). As an example, I consider the following code (from Venables & Ripley, 1999).

library(MASS); library(boot); library(nls)
data(stormer)
storm.fm <- nls(Time ~ b*Viscosity/(Wt - c), stormer,
                start = c(b=29.401, c=2.2183))
st <- cbind(stormer, fit=fitted(storm.fm))
storm.bf <- function(rs, i) {
    st$Time <- st$fit + rs[i]
    tmp <- nls(Time ~ (b * Viscosity)/(Wt - c), st,
               start = coef(storm.fm))
    tmp$m$getAllPars()
}
rs <- scale(resid(storm.fm), scale = FALSE) # remove the mean
Rprof("boot.out")
storm.boot <- boot(rs, storm.bf, R = 4999) # pretty slow
Rprof(NULL)
summaryRprof()
Error in summaryRprof() : no events were recorded

I am using R 1.8.1 on Windows. Why can't I get the results?

Zhen
I am sorry for not acknowledging `Writing R Extensions'; I thought I was just citing the code from the original R-help. I failed to get results with my own code, so I turned to this code, in which Rprof() appears. Anyway, I am sorry for this.

In fact, I have tried both with and without specifying boot.out. Neither works.

Rprof("boot.out")
storm.boot <- boot(rs, storm.bf, R = 4999) # pretty slow
Rprof(NULL)
summaryRprof()
Error in summaryRprof() : no events were recorded
summaryRprof("boot.out")
Error in summaryRprof("boot.out") : no events were recorded

Rprof()
storm.boot <- boot(rs, storm.bf, R = 4999) # pretty slow
Rprof(NULL)
summaryRprof()
Error in summaryRprof() : no events were recorded

Zhen

From: Prof Brian Ripley <ripley at stats.ox.ac.uk>
To: Zhen Pang <nusbj at hotmail.com>
CC: tlumley at u.washington.edu, r-help at stat.math.ethz.ch
Subject: Re: [R] sapply and loop
Date: Tue, 19 Oct 2004 07:29:54 +0100 (BST)

> On Tue, 19 Oct 2004, Zhen Pang wrote:
>
> > I tried to use Rprof(). As an example, I consider the following code
> > (from Venables & Ripley, 1999).
>
> I believe you parroted that from `Writing R Extensions', but failed to
> give proper credit!
>
> > [...]
> > Rprof("boot.out")
> > storm.boot <- boot(rs, storm.bf, R = 4999) # pretty slow
> > Rprof(NULL)
>
> At this point your unacknowledged copying went adrift.
>
> > summaryRprof()
> > Error in summaryRprof() : no events were recorded
> >
> > I am using R1.8.1 in windows. Why can't I get the results?
>
> Because you didn't do your homework, and didn't even follow your source.
> The 'file' arguments of Rprof and summaryRprof have to agree: see their
> help pages (as the posting guide asks).
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford
> 1 South Parks Road
> Oxford OX1 3TG, UK
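The point about the 'file' arguments can be sketched as follows. The workload() function is a hypothetical CPU-bound stand-in for the boot() call, and prof_file is an arbitrary temporary path; both are assumptions of this sketch. It also assumes an R build with profiling support. The workload is repeated for about one second so that the sampler has a chance to record events.

```r
# Hypothetical CPU-bound workload standing in for the boot() call
workload <- function() sapply(1:50, function(i) sum(sort(runif(1e4))))

prof_file <- tempfile(fileext = ".out")   # any writable path would do

Rprof(prof_file, interval = 0.01)         # start sampling into prof_file
t_end <- proc.time()[["elapsed"]] + 1     # keep busy for about one second
while (proc.time()[["elapsed"]] < t_end) zz <- workload()
Rprof(NULL)                               # stop profiling

# Read the samples back from the same file that Rprof() wrote to
prof <- summaryRprof(prof_file)
print(head(prof$by.self))
print(prof$sampling.time)
```

Holding the file name in one variable and passing it to both calls keeps the two 'file' arguments in agreement. This illustrates the matching arguments only; it does not explain the errors reported above for R 1.8.1 on Windows.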