Vadim Ogranovich
2005-May-04 05:53 UTC
[Rd] Cost of method dispatching: was: when can we expect Prof Tierney's compiled R?
> -----Original Message----- > From: Prof Brian Ripley [mailto:ripley@stats.ox.ac.uk] > Sent: Wednesday, April 27, 2005 1:13 AM > To: Vadim Ogranovich > Cc: Luke Tierney; r-devel@stat.math.ethz.ch > Subject: Re: [Rd] RE: [R] when can we expect Prof Tierney's > compiled R? > > On Tue, 26 Apr 2005, Vadim Ogranovich wrote: >...> > The arithmetic shows that x[i]<- is still the bottleneck. I suspect > > that this is due to a very involved dispatching/search for the > > appropriate function on the C level. There might be > significant gain > > if loops somehow cached the result of the initial > dispatching. This is > > what you probably referred to as additional improvements in > the engine itself. > > I'd be surprised if dispatching were the issue: have you > (C-level) profiled to find out? Please do so: these > statements do tend to get perpetuated as fact.For the record, I didn't profile the dispatching, so it is only my guess and is not verified by C-level profiling. The guess is based on reading the code and on the following timing on R level:> n = 1e6; iA = seq(2,n); x = double(n); > f1 <- function(x, iA) for (i in iA) x[i] = c(1.0) > f2 <- function(x, iA) for (i in iA) x = c(1.0) > last.gc.time = gc.time(TRUE) > system.time(f1(x, iA), gcFirst=TRUE)[1] 3.50 0.01 3.52 0.00 0.00> print(gc.time() - last.gc.time); last.gc.time = gc.time()[1] 1.25 0.82 1.24 0.00 0.00> system.time(f2(x, iA), gcFirst=TRUE)[1] 0.76 0.00 0.77 0.00 0.00> print(gc.time() - last.gc.time); last.gc.time = gc.time()[1] 0.25 0.18 0.23 0.00 0.00 f1 and f2 are identical except that the first assigns to an element in the vector (and thus goes through the method dispatching). Originally I had thought that the number of allocations in f1 and in f2 must be the same, the c(1.0) call. But gc.time() shows that the number of allocations in f1 is indeed, as Prof. Ripley suggests, bigger than in f2. It is not clear to me where these extra allocations come from and whether they are necessary. All x[i] = c(1.0) needs to do is to create a new vector c(1.0), which is a step common between f1 and f2, and then copy from the vector into x[i]. However even after discounting for gc.time the assignment to x[i] seems to be heavy.> > You cannot cache the result, as [<- can change the class of > x, as could other operations done by the rhs (e.g. if it were > x[i] <- g(x, i) the function g could change its argument).Yes, however R may try to use the last method found and only when that fails go for the full dispatch. This should give a lot of gain in a typical case when the vars. types do not change.
Duncan Murdoch
2005-May-04 10:17 UTC
[Rd] Cost of method dispatching: was: when can we expect Prof Tierney's compiled R?
Vadim Ogranovich wrote:> > > >>-----Original Message----- >>From: Prof Brian Ripley [mailto:ripley@stats.ox.ac.uk] >>Sent: Wednesday, April 27, 2005 1:13 AM >>To: Vadim Ogranovich >>Cc: Luke Tierney; r-devel@stat.math.ethz.ch >>Subject: Re: [Rd] RE: [R] when can we expect Prof Tierney's >>compiled R? >> >>On Tue, 26 Apr 2005, Vadim Ogranovich wrote: >> > > ... > >>>The arithmetic shows that x[i]<- is still the bottleneck. I suspect >>>that this is due to a very involved dispatching/search for the >>>appropriate function on the C level. There might be >> >>significant gain >> >>>if loops somehow cached the result of the initial >> >>dispatching. This is >> >>>what you probably referred to as additional improvements in >> >>the engine itself. >> >>I'd be surprised if dispatching were the issue: have you >>(C-level) profiled to find out? Please do so: these >>statements do tend to get perpetuated as fact. > > > > For the record, I didn't profile the dispatching, so it is only my guess > and is not verified by C-level profiling. > > The guess is based on reading the code and on the following timing on R > level: > >>n = 1e6; iA = seq(2,n); x = double(n); >>f1 <- function(x, iA) for (i in iA) x[i] = c(1.0) >>f2 <- function(x, iA) for (i in iA) x = c(1.0) >>last.gc.time = gc.time(TRUE) >>system.time(f1(x, iA), gcFirst=TRUE) > > [1] 3.50 0.01 3.52 0.00 0.00 > >>print(gc.time() - last.gc.time); last.gc.time = gc.time() > > [1] 1.25 0.82 1.24 0.00 0.00 > >>system.time(f2(x, iA), gcFirst=TRUE) > > [1] 0.76 0.00 0.77 0.00 0.00 > >>print(gc.time() - last.gc.time); last.gc.time = gc.time() > > [1] 0.25 0.18 0.23 0.00 0.00 > > f1 and f2 are identical except that the first assigns to an element in > the vector (and thus goes through the method dispatching). > > Originally I had thought that the number of allocations in f1 and in f2 > must be the same, the c(1.0) call. But gc.time() shows that the number > of allocations in f1 is indeed, as Prof. Ripley suggests, bigger than in > f2. It is not clear to me where these extra allocations come from and > whether they are necessary. All x[i] = c(1.0) needs to do is to create a > new vector c(1.0), which is a step common between f1 and f2, and then > copy from the vector into x[i]. > > However even after discounting for gc.time the assignment to x[i] seems > to be heavy. > > >>You cannot cache the result, as [<- can change the class of >>x, as could other operations done by the rhs (e.g. if it were >>x[i] <- g(x, i) the function g could change its argument). > > > Yes, however R may try to use the last method found and only when that > fails go for the full dispatch. This should give a lot of gain in a > typical case when the vars. types do not change.There are probably efficiency improvements available, but they need to be done very carefully. For example, the default method of [<- could be called in one step, and as a side effect create a more specific method. So for the second call we should call the more specific one, but the default call will still be valid in the sense that the arguments match the signature (S4) or the class matches the name (S3), but not in the sense that it is the method that should be called. Duncan Murdoch
Possibly Parallel Threads
- RE: [R] when can we expect Prof Tierney's compiled R?
- RE: [R] when can we expect Prof Tierney's compiled R?
- Light-weight data.frame class: was: how to add method to .Primitive function
- when can we expect Prof Tierney's compiled R?
- clean-up actions after non-local exits