thr3ads.net - R devel - [Rd] R-devel Digest, Vol 109, Issue 22 [Mar 2012]

Terry Therneau

2012-Mar-22 13:45 UTC

[Rd] R-devel Digest, Vol 109, Issue 22

>>> strongly disagree. I'm appalled to see that sentence here.
>>>
>>> Come on!
>>>
>>>>> The overhead is significant for any large vector and it is in
>>>>> particular unnecessary since in .C you have to allocate *and copy*
>>>>> space even for results (twice!). Also it is very error-prone, because
>>>>> you have no information about the length of vectors so it's easy to
>>>>> run out of bounds and there is no way to check. IMHO .C should not be
>>>>> used for any code written in this century (the only exception may be
>>>>> if you are passing no data, e.g. if all you do is to pass a flag and
>>>>> expect no result, you can get away with it even if it is more
>>>>> dangerous). It is a legacy interface that dates way back and is
>>>>> essentially just a renamed .Fortran interface. Again, I would strongly
>>>>> recommend the use of .Call in any recent code because it is safer and
>>>>> more efficient (if you don't care about either attribute, well, feel
>>>>> free ;)).
>>>
>>> So aleph will not support the .C interface? ;-)
>>>
> It will look at the timestamp of the source file and delete the package if
> it is not before 1980 ;). Otherwise it will send a request for punch cards
> with ".C is deprecated, please upgrade to .Call" stamped out :P At that
> point I'll be flaming about using the native Aleph interface and not the R
> compatibility layer ;)
>
> Cheers,
> S

I'll dissent -- I don't think .C is inherently any more dangerous than 
.Call and prefer its simplicity in many cases.  Calling C at all is 
what is inherently dangerous -- I can reference beyond the end of a 
vector, write over objects that should be read only, and branch to 
random places using either interface.  If you are dealing with large 
objects and worry about memory efficiency then .Call puts more tools at 
your disposal and is worth the effort.  However, I did not find the 
.Call interface at all easy to use at first and we should keep that in 
mind before getting too pompous in our lectures to the "sinners of .C".
(Mostly because the things I needed to know are scattered about in 
multiple places.)

I might have to ask for an exemption on that timestamp -- the first bits 
of the survival package only reach back to 1986.  And I've had to change 
source code systems multiple times which plays hob with the file times, 
though I did try to preserve the changelog history to forestall some 
future litigious soul who claims they wrote it first  (sccs -> rcs -> 
cvs -> svn -> mercurial).   :-)

Terry T


Simon Urbanek

2012-Mar-22 14:38 UTC

[Rd] R-devel Digest, Vol 109, Issue 22

On Mar 22, 2012, at 9:45 AM, Terry Therneau <therneau@mayo.edu> wrote:
>>>> strongly disagree. I'm appalled to see that sentence here.
>>>>
>>>> Come on!
>>>>
>>>>>> The overhead is significant for any large vector and it is in
>>>>>> particular unnecessary since in .C you have to allocate *and copy*
>>>>>> space even for results (twice!). Also it is very error-prone, because
>>>>>> you have no information about the length of vectors so it's easy to
>>>>>> run out of bounds and there is no way to check. IMHO .C should not be
>>>>>> used for any code written in this century (the only exception may be
>>>>>> if you are passing no data, e.g. if all you do is to pass a flag and
>>>>>> expect no result, you can get away with it even if it is more
>>>>>> dangerous). It is a legacy interface that dates way back and is
>>>>>> essentially just a renamed .Fortran interface. Again, I would strongly
>>>>>> recommend the use of .Call in any recent code because it is safer and
>>>>>> more efficient (if you don't care about either attribute, well, feel
>>>>>> free ;)).
>>>>
>>>> So aleph will not support the .C interface? ;-)
>>>>
>> It will look at the timestamp of the source file and delete the package
>> if it is not before 1980 ;). Otherwise it will send a request for punch
>> cards with ".C is deprecated, please upgrade to .Call" stamped out :P At
>> that point I'll be flaming about using the native Aleph interface and not
>> the R compatibility layer ;)
>>
>> Cheers,
>> S
> I'll dissent -- I don't think .C is inherently any more dangerous than
> .Call and prefer its simplicity in many cases.  Calling C at all is what
> is inherently dangerous -- I can reference beyond the end of a vector,
> write over objects that should be read only, and branch to random places
> using either interface.

You can always do so deliberately, but with .C you have no way of preventing it
since you don't even know what the length is! That is certainly far more
dangerous than .Call, where you can simply loop over the length, check that the
lengths are compatible, etc. Also, for types like strings .C is a minefield that
is hard not to blow up, whereas with .Call it is even safer than scalar arrays.
You can do none of that with .C, which relies entirely on conventions with no
recorded semantics.
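
To make the point about lengths concrete, here is a minimal sketch in plain C.
It deliberately does not use the R API: vec_t is a hypothetical stand-in for an
R vector object that carries its own length (roughly what the callee sees with
.Call), and the function names are invented for illustration. The first
function mimics the .C convention, where the callee receives bare pointers and
has to trust a separately passed length.

```c
#include <stddef.h>

/* Hypothetical stand-in for a vector object that knows its own length.
   This is NOT R's SEXP type; it only models the idea. */
typedef struct {
    double *data;
    size_t  length;
} vec_t;

/* .C-style convention (illustrative): everything arrives as a bare pointer.
   The callee cannot verify that x really has *n elements; if the caller
   passes the wrong n, the loop silently runs out of bounds. */
void scale_c_style(double *x, int *n, double *factor)
{
    for (int i = 0; i < *n; i++)
        x[i] *= *factor;
}

/* .Call-style convention (illustrative): lengths travel with the objects,
   so the callee can check compatibility before touching any memory.
   Returns 0 on success, -1 if the lengths do not match. */
int add_call_style(const vec_t *a, const vec_t *b, vec_t *out)
{
    if (a->length != b->length || out->length != a->length)
        return -1;
    for (size_t i = 0; i < a->length; i++)
        out->data[i] = a->data[i] + b->data[i];
    return 0;
}
```

Called with two length-3 vectors and a length-3 result, add_call_style fills
the result and returns 0; given a length-2 argument it returns -1 without
writing anything. scale_c_style has no equivalent way to refuse a wrong n.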

> If you are dealing with large objects and worry about memory efficiency
> then .Call puts more tools at your disposal and is worth the effort.
> However, I did not find the .Call interface at all easy to use at first

I guess this depends on the developer and is certainly a factor. Personally, I
find the subset of the R API needed for .Call fairly small and intuitive (in
particular when you are just writing a safer replacement for .C), but I'm
obviously biased. Maybe in a separate thread we could discuss this - I'd be
happy to write a ref card or cheat sheet if I find out what people find
challenging about .Call. Nonetheless, my point is that it is more than worth
investing the effort, for both safety and performance.

> and we should keep that in mind before getting too pompous in our lectures
> to the "sinners of .C".  (Mostly because the things I needed to know are
> scattered about in multiple places.)
>
> I might have to ask for an exemption on that timestamp -- the first bits of
> the survival package only reach back to 1986.  And I've had to change source
> code systems multiple times which plays hob with the file times, though I did
> try to preserve the changelog history to forestall some future litigious soul
> who claims they wrote it first  (sccs -> rcs -> cvs -> svn -> mercurial).   :-)
>
;) Maybe the rule should be based on the date of the first appearance of the
package, fair enough :)

Cheers,
Simon