thr3ads.net - R help - [R] Why is looping in R inefficient, but in C not? [Jun 2011]

Alexander Engelhardt

2011-Jun-26 06:56 UTC

[R] Why is looping in R inefficient, but in C not?

Hey,
I just read another post about calling R from C. Someone on 
stackoverflow (DWin makes me suspect it's David W.?) referenced this: 
http://www.math.univ-montp2.fr/~pudlo/R_files/call_R.pdf
Which made me think: Why is a loop in R bad, but in C not?

And where exactly does looping cost the most? I wrote a piece of code 
for my bachelor's thesis where I loop from 1 to 500, and estimate a 
boosted model in every iteration. The procedure takes 2-6 minutes. In 
this example the loop (instead of some kind of apply()) shouldn't cost 
too much time, right?
I suspect it's way worse if someone loops from 1 to 10000 and 
performs only a small task (a mean(), for example) in each iteration. Can 
someone confirm this?

Regards,
  Alex

Jeff Newmiller

2011-Jun-26 07:21 UTC


For the same reason the Cray X-MP was fast at numerical computations: a loop
written in a low-level language can be optimized to run faster than one written
in a higher-level language. The X-MP optimized loops in hardware, but R just
optimizes them in C code, exposed to the R programmer as vector operations.
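
To make that concrete, here is a minimal sketch comparing an interpreted R loop with the equivalent vector operation, whose loop runs in compiled code inside R. The helper names sum_sq_loop and sum_sq_vec are made up for illustration, and timings depend on the machine and R version, so none are quoted here.

```r
# Hypothetical helper: sum of squares with an explicit, interpreted loop
sum_sq_loop <- function(v) {
  total <- 0
  for (xi in v) {
    total <- total + xi^2
  }
  total
}

# Hypothetical helper: the same result via vectorized operations;
# the looping happens in compiled code behind `^` and sum()
sum_sq_vec <- function(v) sum(v^2)

v <- as.numeric(1:100000)
t_loop <- system.time(r_loop <- sum_sq_loop(v))
t_vec  <- system.time(r_vec  <- sum_sq_vec(v))

# Elapsed seconds for each approach
print(c(loop = t_loop[["elapsed"]], vectorized = t_vec[["elapsed"]]))

# Both approaches should agree on the result
print(all.equal(r_loop, r_vec))
```

Both versions do the same arithmetic; what differs is whether each iteration goes through the R interpreter.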
Sent from my phone. Please excuse my brevity.


David Winsemius

2011-Jun-26 14:00 UTC


On Jun 26, 2011, at 2:56 AM, Alexander Engelhardt wrote:

> Hey,
> I just read another post about calling R from C. Someone on
> stackoverflow (DWin makes me suspect it's David W.?) referenced this:
> http://www.math.univ-montp2.fr/~pudlo/R_files/call_R.pdf
> Which made me think: Why is a loop in R bad, but in C not?

I do not think the cited authority provides any support to that
notion. It rather suggests that things which might benefit from using
a compiler can be fairly easily passed to C.

> And where exactly does looping cost the most? I wrote a piece of
> code for my bachelor's thesis where I loop from 1 to 500, and
> estimate a boosted model in every iteration. The procedure takes 2-6
> minutes. In this example the loop (instead of some kind of apply())
> shouldn't cost too much time, right?
> I suspect it's way worse if someone loops from 1 to 10000 and
> performs only a small task (a mean(), for example) in each iteration. Can
> someone confirm this?

_You_ can investigate it. I cannot determine from your statements what
expectations you have for an apply-vs-loop test, so I am not sure if
this is confirming or disproving:

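  # Pre-allocate both result vectors; x is a 10000 x 20 matrix (1:100 recycled)
  z2 <- z <- vector("numeric", 10000)
  x <- matrix(1:100, 10000, 20)
  # Time apply() over the rows
  aloop1 <- Sys.time(); z <- apply(x, 1, mean); difftime(Sys.time(), aloop1)
  # Time an explicit for loop doing the same work
  aloop2 <- Sys.time(); for (i in 1:10000) { z2[i] <- mean(x[i, ]) }; difftime(Sys.time(), aloop2)
  identical(z, z2)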

Probably not in line with your current understanding. I wonder whether
the trivial advantage offered by apply (due, I suspect, to the single
assignment) is in line with your understanding. Most of the efficiency in
apply operations is at the level of clarity of the code and ease of
use. The largest efficiency gains come from using the proper vectorized
operations, which can be 50-100 times faster:

 >  aloop3 <- Sys.time(); z3 <- rowMeans(x); difftime(Sys.time(), aloop3)
Time difference of 0.01409197 secs
 >  identical(z, z3)
[1] TRUE
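
On the question of where the time goes, one rough way to check is to time a loop whose body does almost nothing against the same loop with a small task in it. The sketch below does that with system.time(); the variable names (n, t_empty, t_mean, t_vec) are only illustrative, and the numbers will differ between machines.

```r
n <- 10000
x <- matrix(1:100, n, 20)
z2 <- numeric(n)   # pre-allocated result vector

# Loop whose body does almost nothing: approximates the pure loop overhead
t_empty <- system.time(for (i in seq_len(n)) NULL)

# The same loop with a small task (a row mean) in each iteration
t_mean <- system.time(for (i in seq_len(n)) z2[i] <- mean(x[i, ]))

# Vectorized equivalent
t_vec <- system.time(z3 <- rowMeans(x))

# Elapsed seconds for each variant
print(c(empty_loop = t_empty[["elapsed"]],
        loop_mean  = t_mean[["elapsed"]],
        rowMeans   = t_vec[["elapsed"]]))

# The loop and rowMeans() should agree
print(all.equal(z2, z3))
```

If the empty loop accounts for a negligible share of the total, as one would expect when each iteration fits a model, replacing the loop with apply() is unlikely to change the run time much.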

Other efficiency strategies are to pre-allocate structures of known
size and to avoid using c, cbind or rbind operations to accumulate
results in a loop.
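
A minimal sketch of the pre-allocation advice, assuming a result of known length n. The functions grow and prealloc are hypothetical names used only for this comparison.

```r
n <- 10000

# Growing the result with c(): a new, longer vector is built on every iteration
grow <- function(n) {
  out <- numeric(0)
  for (i in seq_len(n)) out <- c(out, i^2)
  out
}

# Pre-allocating: the vector is created once at full size and then filled in
prealloc <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) out[i] <- i^2
  out
}

t_grow <- system.time(a <- grow(n))
t_pre  <- system.time(b <- prealloc(n))

# Elapsed seconds for each variant
print(c(grow = t_grow[["elapsed"]], prealloc = t_pre[["elapsed"]]))

# Both variants should produce the same vector
print(identical(a, b))
```

The same pattern applies to cbind and rbind: create the full matrix first and assign into its rows or columns inside the loop.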
-- 
David Winsemius, MD
West Hartford, CT
