thr3ads.net - R devel - [Rd] Mere chat on vectorisation matters [May 2006]

François Pinard

2006-May-10 18:16 UTC

[Rd] Mere chat on vectorisation matters

Hi, people.  Allow me to chat a tiny bit on two vectorisation-related 
matters, in the context of R.  I'm curious whether the following ideas 
have ever been considered, and perhaps rejected already.

The first is about using the so-called Duff's device for partially 
unrolling loops.  I did not check the R sources thoroughly, and am not 
familiar with them anyway, but the only usage I saw is within 
"src/gnuwin32/malloc.c".  Maybe it could be put to good use in 
"src/main/arithmetic.c" and elsewhere.  The second is about what is 
called "chaining" on some vector computers, in which one vector 
operation uses, as an operand, the result of another vector operation, 
even before that result is sent to register or memory storage; R could 
use this technique to spare memory, when it "knows" that the 
intermediate result is going to be discarded anyway.

I used and abused Duff's device a good while ago, when I was working 
in computer graphics; it was routinely used to speed up image-wide 
operations.  With a few properly devised C pre-processor macros, it was 
made easy to use (I threw mine away a few years ago, recognizing I had 
lost interest in low-level coding matters; the macros could easily be 
rethought anyway).  Questions existed at the time about whether unrolled 
loops would fit within the specialised instruction caches of some CPUs, 
but nowadays, caches are much bigger than they used to be, and my 
prejudice is that this is just not a problem anymore.  More of a concern 
might be the conditionals implementing vector recycling (already hidden 
in macros), as they may disrupt the speed of merely falling through 
linear code.  One could probably do without jumps using clever masking 
operations, yet I wonder how far we could safely benchmark at 
configuration time to decide the best code to generate, and how well 
suited C would be for writing masked conditionals.  I'm not familiar 
enough with modern CPUs to judge whether this really needs to be 
addressed or not.
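
For readers who have not met it, here is a minimal sketch of Duff's 
device applied to an elementwise addition.  The function name add_duff 
is hypothetical, this is not the code found in R's arithmetic.c, and it 
assumes both operands have the same length (no vector recycling):

```c
#include <stddef.h>

/* Hypothetical helper: dst[i] = a[i] + b[i] for i in [0, n),
   unrolled eight times with Duff's device.
   Assumes a, b and dst all hold n elements (no recycling). */
static void add_duff(double *dst, const double *a, const double *b, size_t n)
{
    if (n == 0)
        return;                     /* the do-while body would otherwise run */

    size_t passes = (n + 7) / 8;    /* trips through the unrolled loop */

    switch (n % 8) {                /* jump into the loop body for the remainder */
    case 0: do { *dst++ = *a++ + *b++;
    case 7:      *dst++ = *a++ + *b++;
    case 6:      *dst++ = *a++ + *b++;
    case 5:      *dst++ = *a++ + *b++;
    case 4:      *dst++ = *a++ + *b++;
    case 3:      *dst++ = *a++ + *b++;
    case 2:      *dst++ = *a++ + *b++;
    case 1:      *dst++ = *a++ + *b++;
            } while (--passes > 0);
    }
}
```

The switch handles the n % 8 leftover elements on the first pass, so no 
separate clean-up loop is needed; every later pass runs all eight 
statements with a single loop test.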

I do not doubt that hardware chaining is worth all the effort engineers 
put in so that the hardware recognises and activates it on the fly.  
Vectorised chaining implemented in software, as a way to spare memory, 
may be quite a challenge, as it requires a sort of half-compilation.  
On one hand, it might alleviate the memory problems which are often the 
subject of discussions on R-help; through thrashing, going over real 
memory and into paging may considerably slow down an R application.  On 
the other hand, unless very carefully implemented, chaining overhead 
might slow down all non-thrashing applications, which is most of them.  
Nevertheless, being softer on memory requirements is already a concern 
in R: I vaguely remember having read that R "tries to prove" that 
a vector being modified will not be needed anymore in its original form, 
and when the proof succeeds, the original vector gets modified without 
prior copying.  Chaining, though difficult to implement, might be 
a significant further step, and so be worth a discussion.
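
To make the memory argument concrete, here is a small sketch of what 
software chaining would buy for an expression such as a * b + c.  Both 
function names are hypothetical, equal-length operands are assumed, and 
this only illustrates the idea; it does not describe how R actually 
evaluates expressions:

```c
#include <stddef.h>
#include <stdlib.h>

/* Unchained: materialise tmp = a * b in full, then compute dst = tmp + c.
   Needs n extra doubles.  Returns 0 on success, -1 if allocation fails. */
static int muladd_two_pass(double *dst, const double *a, const double *b,
                           const double *c, size_t n)
{
    double *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL && n > 0)
        return -1;

    for (size_t i = 0; i < n; i++)
        tmp[i] = a[i] * b[i];       /* first vector operation */
    for (size_t i = 0; i < n; i++)
        dst[i] = tmp[i] + c[i];     /* second one consumes the temporary */

    free(tmp);
    return 0;
}

/* Chained: each product feeds the addition directly, so the
   intermediate vector never exists in memory. */
static void muladd_chained(double *dst, const double *a, const double *b,
                           const double *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] * b[i] + c[i];
}
```

The chained form is trivial once written by hand; the hard part is 
getting an interpreter to recognise, at run time, that the temporary is 
going to be discarded and to produce the fused loop for arbitrary 
expressions, which is the half-compilation mentioned above.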

-- 
François Pinard   http://pinard.progiciels-bpi.ca

François Pinard

2006-May-13 21:31 UTC

[Rd] Mere chat on vectorisation matters

Hi to all.  Not so long ago, I wrote:
>Allow me to chat a tiny bit on two vectorisation-related matters, in 
>the context of R.  I'm curious about if the following ideas have ever 
>been considered, and rejected already. [... then, a few words about 
>Duff's device, and operation chaining ...]
As this letter did not generate any reply, I presumed the ideas had not 
been rejected, on the premise that if they had been, someone would have 
been kind enough to tell me :-).  So, today, I went forward and added 
Duff's device within arithmetic.c.

There might have been operational or experimental error, of course, as 
I'm far from mastering R installation matters.  But if there were no 
such error, I'm now reporting, for the record, that Duff's device does 
_not_ yield an interesting speedup.

-- 
François Pinard   http://pinard.progiciels-bpi.ca
