Thanks Duncan -- forwarding to the list as this cautionary tale seems very
helpful! Cheers, Jonathan.
Jonathan Rougier Science Laboratories
Department of Mathematical Sciences South Road
University of Durham Durham DH1 3LE
http://www.maths.dur.ac.uk/stats/people/jcr/jcr.html
---------- Forwarded message ----------
Date: Thu, 7 Sep 2000 10:42:23 -0400 (EDT)
From: Duncan Temple Lang <duncan at research.bell-labs.com>
To: J.C.Rougier at durham.ac.uk
Subject: Re: [R] .C and DUP=TRUE versus .Call
> Date: Thu, 7 Sep 2000 13:29:21 +0100 (BST)
> From: Jonathan Rougier <J.C.Rougier at durham.ac.uk>
> Sender: owner-r-help at stat.math.ethz.ch
> Precedence: bulk
>
> Hi Everyone,
>
> I have a piece of C code that uses R_alloc, and so I set DUP=TRUE in the
> call using ".C". As I understand it this takes a copy of each object
> passed to my function. If these objects are large then this could be
> expensive. My question is, if I rewrote the code to use .Call, would I
> avoid this duplication by using the objects themselves (they are not
> modified in the code) rather than copies?
>
> Many thanks, Jonathan.
>
> Jonathan Rougier Science Laboratories
> Department of Mathematical Sciences South Road
> University of Durham Durham DH1 3LE
> http://www.maths.dur.ac.uk/stats/people/jcr/jcr.html
>
As far as I can see, you are correct in thinking that the .Call will
not copy the R objects that are passed to it. That will avoid the
duplication issue.
However, there are other issues that you must be aware of when using
.Call().

1) Each argument to the C routine will be an R object, declared in C as
a SEXP. You can think of this as an object that contains the information
R needs to know what type of data is "in" the object, plus a pointer to
the data itself (e.g. an array of numbers).

With .C(), the numbers are given to you as a C-level array of doubles,
i.e. double *. With .Call(), they might be given to you as an argument

    SEXP x

and you can access the individual values as

    NUMERIC_POINTER(x)[i]

But you need to be very careful if you do the following:

    double *vals = NUMERIC_POINTER(x);
    ....
    vals[i]

It is possible that the R engine will need to move the
numbers somewhere else during garbage collection. If so, R
will update the pointer in the SEXP x to refer to this new
location of the values. But R cannot know to update vals.
In such cases, vals would be pointing to the old location
of the data and all sorts of "interesting" things can happen.
And if they can, they will :-)
Thus, it is _much_ safer to always use

    NUMERIC_POINTER(x)[i]
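
The stale-pointer hazard described above can be imitated in plain C,
without R at all. What follows is a minimal sketch, not R's actual
memory manager: MockVec, mock_numeric_pointer() and mock_compact() are
hypothetical names invented for this illustration, standing in for a
SEXP, NUMERIC_POINTER() and a collection that relocates data. It assumes,
as the text does, that the engine may move the numbers and only knows to
update the pointer stored inside the object.

```c
#include <stddef.h>
#include <string.h>

#define MOCK_HEAP_SIZE 8
#define MOCK_POISON (-999.0)   /* marks memory the data has moved out of */

/* Hypothetical stand-in for a SEXP holding doubles: a length plus a
   pointer to the data, which the "engine" is free to relocate. */
typedef struct {
    size_t  length;
    double *data;
} MockVec;

/* Two halves of a toy heap; "collection" copies data between them. */
static double mock_heap_a[MOCK_HEAP_SIZE];
static double mock_heap_b[MOCK_HEAP_SIZE];

/* Stand-in for NUMERIC_POINTER(x): always returns the current location. */
double *mock_numeric_pointer(MockVec *x) {
    return x->data;
}

/* Create a vector in half A holding 1.0, 2.0, ..., n (n is capped). */
MockVec mock_make_vec(size_t n) {
    MockVec v;
    size_t i;
    if (n > MOCK_HEAP_SIZE) n = MOCK_HEAP_SIZE;
    v.length = n;
    v.data = mock_heap_a;
    for (i = 0; i < n; i++) v.data[i] = (double)(i + 1);
    return v;
}

/* Stand-in for a collection that moves the data: copy it to the other
   half, poison the old location, and update the pointer in the object.
   Nothing here can know about raw pointers cached by the caller. */
void mock_compact(MockVec *x) {
    double *old_home = x->data;
    double *new_home = (old_home == mock_heap_a) ? mock_heap_b : mock_heap_a;
    size_t i;
    memcpy(new_home, old_home, x->length * sizeof(double));
    for (i = 0; i < x->length; i++) old_home[i] = MOCK_POISON;
    x->data = new_home;
}

/* Risky pattern: cache the raw pointer, let the data move, then read
   through the cached (now stale) pointer. */
double read_via_cached_pointer(MockVec *x, size_t i) {
    double *vals = mock_numeric_pointer(x);
    mock_compact(x);
    return vals[i];
}

/* Safer pattern: look the pointer up again at the moment of access. */
double read_via_fresh_lookup(MockVec *x, size_t i) {
    mock_compact(x);
    return mock_numeric_pointer(x)[i];
}
```

In this toy the stale read returns the poison value rather than
crashing, which keeps the example well defined; in a real program the
old location could hold anything.
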
2) When a character vector (e.g. c("a", "b", "cde")) is passed via
.Call() to a C routine, it does not arrive in the form of a char **.
To get the i-th string in the vector, you need to do

    CHAR(CHARACTER_POINTER(x)[i])
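
To see why a character vector is not a char **, it can help to picture
each element as a small object of its own that has to be unwrapped
before the characters are reachable. The sketch below imitates that
idea in plain C. MockCharsxp, MockStrVec, mock_character_pointer() and
mock_char() are hypothetical names standing in for R's internal string
objects, CHARACTER_POINTER() and CHAR(); the layout is an assumption
made for illustration and is not R's real one.

```c
#include <stddef.h>

/* Hypothetical stand-in for one string element: some header information
   kept by the engine, plus the characters themselves. */
typedef struct {
    size_t      length;   /* number of characters, excluding the NUL */
    const char *chars;    /* the NUL-terminated text */
} MockCharsxp;

/* Hypothetical stand-in for a character vector: an array of pointers to
   string objects, NOT an array of char pointers. */
typedef struct {
    size_t        length;
    MockCharsxp **elements;
} MockStrVec;

/* Stand-in for CHARACTER_POINTER(x): the array of element objects. */
MockCharsxp **mock_character_pointer(const MockStrVec *x) {
    return x->elements;
}

/* Stand-in for CHAR(s): unwrap one element to reach its characters. */
const char *mock_char(const MockCharsxp *s) {
    return s->chars;
}

/* The two-step access from the text, with a bounds check added.
   Returns NULL if i is out of range. */
const char *mock_get_string(const MockStrVec *x, size_t i) {
    if (i >= x->length) return NULL;
    return mock_char(mock_character_pointer(x)[i]);
}

/* A mock of the vector c("a", "b", "cde"). */
static MockCharsxp mock_s0 = {1, "a"};
static MockCharsxp mock_s1 = {1, "b"};
static MockCharsxp mock_s2 = {3, "cde"};
static MockCharsxp *mock_elems[] = {&mock_s0, &mock_s1, &mock_s2};

MockStrVec mock_example_vector(void) {
    MockStrVec v;
    v.length = 3;
    v.elements = mock_elems;
    return v;
}
```

Treating mock_elems as a char ** would hand back pointers to the
element headers, not to the text, which is the mistake the two-step
access avoids.
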

In spite of these caveats, writing code for use with .Call() is quite
easy and fun once you get the hang of these and a few other ideas.

I hope this helps. If you think it is comprehensible and may help
others, feel free to post it back to the r-help list.
D.
--
_______________________________________________________________
Duncan Temple Lang duncan at research.bell-labs.com
Bell Labs, Lucent Technologies office: (908)582-3217
700 Mountain Avenue, Room 2C-259 fax: (908)582-3340
Murray Hill, NJ 07974-2070
http://cm.bell-labs.com/stat/duncan
"Languages shape the way we think, and determine what
we can think about."
Benjamin Whorf