Ajay Shah <ajayshah <at> mayin.org> writes:
>
> Here's a small R program:
>
> ---------------------------------------------------------------------------
> a <- rep(1,10000000)
>
> system.time(a <- a + 1)
>
> system.time(for (i in 1:10000000) {a[i] <- a[i] + 1})
> ---------------------------------------------------------------------------
>
> and here's its matlab version:
>
> ---------------------------------------------------------------------------
> function forloop()
>
> a = ones(1e7,1);
>
> tic; a = a + 1; toc
>
> tic
> for i=1:1e7
> a(i) = a(i) + 1;
> end
> toc
> ---------------------------------------------------------------------------
>
> The machine used for testing was an i386 core 2 duo machine at 2.2
> GHz, running OS X `Tiger'. The R was 2.8.0 and the matlab was 2008a.
>
> Here's the performance (time taken in seconds):
>
>                      Matlab         R    Ratio (R/Matlab)
> Vector version       0.0446     0.067      1.5x
> For loop version     0.0992    42.209    425.5x
>
> So R is 1.5x costlier than matlab for the vector version and 425.5x
> costlier for the for loop version.
>
> I wonder what we're doing wrong!
OK, I'll bite.
1. Are you sure that the difference in the vector version is
real? On my computer the R system time for the
vector computation ranges from 0.052 to 0.1 seconds.
I would definitely check how much variation there is.
2. Every language has different advantages and disadvantages.
I believe Matlab does a lot of byte-code compiling. If you're
interested in this sort of thing, try Ra
<http://www.milbo.users.sonic.net/ra/index.html> , a version
of R that allows similar speed-ups.
3. I appreciate that every language can be improved, but this
feels like something that is not "fixable" without changing
the language significantly. You have lots of options if you need
to do these kinds of operations -- Matlab (if you can afford it),
Octave (does it run as quickly as Matlab?), programming in C
(see http://modelingwithdata.org/ ), coding the critical
non-vectorizable bits of your algorithm in C or FORTRAN, etc.
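
For point 1, here is a minimal sketch of how one might check the
run-to-run variation of the vectorized timing. The vector length n and
the repeat count reps are illustrative choices of mine (n is smaller
than the 1e7 in the original post so that this runs quickly); adjust
them to match the real test.

```r
# Repeat the vectorized update several times and record the elapsed
# time of each run, to see how much the timings vary.
n    <- 1e6   # assumption: smaller than the original 1e7, for a quick run
reps <- 20    # assumption: number of repeated timings

a <- rep(1, n)
elapsed <- numeric(reps)

for (r in seq_len(reps)) {
  # system.time() returns user, system and elapsed times in seconds
  elapsed[r] <- system.time(a <- a + 1)[["elapsed"]]
}

print(range(elapsed))    # fastest and slowest run
print(summary(elapsed))  # spread of the timings
```

If the spread between the fastest and slowest run is comparable to the
gap between the R and Matlab numbers, the 1.5x figure may not mean much.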
cheers
Ben Bolker