Hi everyone,
I've been using R for months and find it really
practical and straightforward.
However (the inevitable however), I am finding it very
slow for one of my operations:
it's basically an iteration over i and j in a pretty
big table (4 * 4608). It takes 30 minutes!!!!
Thanks
PS: if it can help, here is the source:
median1 <- matrix(nrow = 4608, ncol = 1)
median2 <- matrix(nrow = 4608, ncol = 1)
median3 <- matrix(nrow = 4608, ncol = 1)
median4 <- matrix(nrow = 4608, ncol = 1)
v <- c(18, 19, 20, 21, 23)
for (i in 0:11)
{
    for (j in 1:384)
    {
        median1[j + (i*384), ] <- puce[j + (i*384), 5]  + median(puce[v + 384*i, 2]  - puce[v + 384*i, 5])
        median2[j + (i*384), ] <- puce[j + (i*384), 19] + median(puce[v + 384*i, 16] - puce[v + 384*i, 19])
        median3[j + (i*384), ] <- puce[j + (i*384), 12] + median(puce[v + 384*i, 9]  - puce[v + 384*i, 12])
        median4[j + (i*384), ] <- puce[j + (i*384), 26] + median(puce[v + 384*i, 23] - puce[v + 384*i, 26])
        puce[, 5]  <- median1
        puce[, 19] <- median2
        puce[, 12] <- median3
        puce[, 26] <- median4
    }
}
It is well known that R is inefficient on loops. When you have to perform
a "heavy" loop, it is better to call Fortran or C code (via the .Fortran()
and .C() functions).

A.S.

----------------------------
Alessandro Semeria
Models and Simulations Laboratory
Montecatini Environmental Research Center (Edison Group),
Via Ciro Menotti 48, 48023 Marina di Ravenna (RA), Italy
Tel. +39 544 536811
Fax. +39 544 538663
E-mail: alessandro.semeria at cramont.it
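For reference, a minimal sketch of what such a call looks like; the file
fastloop.c, the routine block_sum, and its arguments are invented for
illustration, not something from this thread:

## Assumes fastloop.c defines
##     void block_sum(double *x, int *n, double *out)
## and has been compiled with:  R CMD SHLIB fastloop.c
dyn.load("fastloop.so")            # fastloop.dll on Windows
x <- rnorm(100)
res <- .C("block_sum",
          as.double(x),            # the data, passed as doubles
          as.integer(length(x)),   # its length
          out = double(1))         # space for the C code to fill in
res$out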
william ritchie wrote:

> Hi everyone,
>
> I've been using R for months and find it really
> practical and straightforward.
> However (the inevitable however), I am finding it very
> slow for one of my operations:
> it's basically an iteration over i and j in a pretty
> big table (4 * 4608). It takes 30 minutes!!!!
>
> Thanks

There was a suggestion to use C or Fortran, but in your particular case it
looks like you can choose a simpler way to get more performance (even if
not as much as in C) by vectorizing a bit more, see below.

> PS: if it can help, here is the source:
>
> [original double loop snipped; see above]

The obvious (well, I haven't tried) *first* step (I don't want to rewrite
your code here!) is, e.g.,

median1 <- median2 <- median3 <- median4 <- numeric(4608)
v <- c(18, 19, 20, 21, 23)
for (i in 0:11) {
    j <- 1:384
    median1[j + (i*384)] <- puce[j + (i*384), 5]  + median(puce[v + 384*i, 2]  - puce[v + 384*i, 5])
    median2[j + (i*384)] <- puce[j + (i*384), 19] + median(puce[v + 384*i, 16] - puce[v + 384*i, 19])
    median3[j + (i*384)] <- puce[j + (i*384), 12] + median(puce[v + 384*i, 9]  - puce[v + 384*i, 12])
    median4[j + (i*384)] <- puce[j + (i*384), 26] + median(puce[v + 384*i, 23] - puce[v + 384*i, 26])
}
puce[, 5]  <- median1
puce[, 19] <- median2
puce[, 12] <- median3
puce[, 26] <- median4

Uwe Ligges
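A further step along the same lines, sketched but untested: each block's
median is a single number, so it can be computed once per pair of columns
and recycled with rep(), dropping the medianN temporaries entirely. This
assumes puce is a numeric matrix with the 12-blocks-of-384-rows layout
above.

v <- c(18, 19, 20, 21, 23)
## (reference column, column to adjust), taken from the original loop
pairs <- list(c(2, 5), c(16, 19), c(9, 12), c(23, 26))
for (p in pairs) {
    ## one median per block of 384 rows
    m <- sapply(0:11, function(i)
        median(puce[v + 384*i, p[1]] - puce[v + 384*i, p[2]]))
    ## add each block's median to every row of its block
    puce[, p[2]] <- puce[, p[2]] + rep(m, each = 384)
}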
"Alessandro Semeria" alessandro.semeria at cramont.it wrote:
> It is well known that R is inefficient on loops.
This is a dangerous half-truth. R is an interpreted language.
The interpreter uses techniques similar to those used in Scheme
interpreters. As interpreters go, it's pretty good. For comparison,
in processing XML documents, I've had interpreted Scheme running rings
around compiled Java (by doing the task a different way, of course).
Also for comparison, years ago I had a Prolog program for median
polish that made a published Fortran program for median polish look
sick (by using a much better data structure). With Luke Tierney's
byte-code compiler, I expect R loops will become close to as efficient
as Python ones, and people run entire web sites with Python.
It is more accurate to say that R code qua R code is not as efficient
as the large body of "primitives" that operate on entire arrays.
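The gap is easy to see for yourself; a toy comparison (the data and
timings here are illustrative only):

## An interpreted R loop vs. a single call to a primitive.
x <- rnorm(1e6)
system.time({ s <- 0; for (xi in x) s <- s + xi })   # loop in R code
system.time(sum(x))                                  # whole-array primitive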
> When you have to perform a "heavy" loop, it is better to call
> Fortran or C code (via the .Fortran() and .C() functions).
Even if the premiss were literally and exactly true, the conclusion
would not follow. When you have a speed problem with R code,
(1) Find out where the problem is, exactly. People's intuition about
performance bottlenecks is notoriously bad. Do what the experts do:
*measure*. (A toy profiling sketch follows this list.)
(2) Try to restructure the code *entirely in R* to be as clear and high
level as possible. If there have to be subscripts, at least let them
be vector subscripts (also sketched after this list).
(3) Measure again. Chances are that making the code clear and high level
has fixed the performance problem.
(4) If that fails, try restructuring the code a couple of ways,
*entirely in R*. The two basic techniques for optimising a calculation
are (a) eliminate it entirely and (b) if you can't eliminate the first
evaluation of an expression, eliminate the second by saving the result.
As a special case of (b), try moving things out of loops; try splitting
a calculation into a part that changes a lot and a part that changes
very little, and update the small-change part only when you have to.
Perhaps apply the idea of program differentiation. (NOT the idea of
taking a function that computes a value and automatically deriving
a function that computes its derivative, but the idea of asking:
if I have z <- f(x,y) and I make a small change to x, do I have to
recompute z completely, or can I make a correspondingly small change
to z? A toy sketch of this, too, follows the list.)
Try to use built-in operations as much as possible, on data structures
that are as large as appropriate.
(5) Measure again. This will probably have fixed the performance problem.
(6) If all else fails, now it's time to try Fortran or C. It's too bad
there isn't an existing Fortran or C module you can just call; if there
had been, you'd have used it before writing the original R code.
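A toy sketch of point (1), using R's profiler; slow_step() is an invented
stand-in for whatever code you suspect:

## Measure first: profile the suspect code, then read the summary.
Rprof("profile.out")           # start the profiler
result <- slow_step()          # hypothetical stand-in for the slow code
Rprof(NULL)                    # stop the profiler
summaryRprof("profile.out")    # which functions ate the time?
## Or simply compare candidate rewrites:
system.time(slow_step())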
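A toy sketch of point (2), and of the advice about built-in operations;
all the data here is made up:

## One vector subscript replaces a whole inner loop.
x <- rnorm(4608); y <- numeric(4608); m <- 0.5; i <- 0
for (j in 1:384)                      # element by element...
    y[j + (i*384)] <- x[j + (i*384)] + m
j <- 1:384                            # ...or all at once
y[j + (i*384)] <- x[j + (i*384)] + m
## And a built-in whole-array operation replaces a loop over rows.
mat <- matrix(rnorm(4608 * 4), ncol = 4)
s <- rowSums(mat)                     # instead of summing row by row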
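And a toy sketch of the program-differentiation idea in point (4): after
a small change to x, make the corresponding small change to z instead of
recomputing it:

## Pretend the full computation of z is expensive.
x <- rnorm(1000)
z <- sum(x)
k <- 17; new <- 3.14          # a small change to one element of x
z <- z + (new - x[k])         # ...and the matching small change to z
x[k] <- new
## rather than recomputing: x[k] <- new; z <- sum(x)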