(I dithered a bit about whether this belongs on r-help (as part of it is
a general R question) or r-devel (as it's a question relating to putting
stuff on CRAN), but decided it might be of general enough interest to go
on the former.)

I am currently preparing an R library to estimate approximate posterior
distributions for parameters in Generalised Linear Mixed Models by Gibbs
Sampling, which includes a lot of dynamically loaded C code. I'm pleased
with the way in which it is working and am now hammering out some
documentation, but closer consideration of some R documentation has led
me to worry that my code a) may not achieve certain standards of R and C
programming and b) may not work on all systems. (I'm using 0.64.1 on a
UNIX system.)

My worries are

a) That I've used calloc rather than S_alloc throughout.

b) That I've used the following technique: I've calls to different C
   programs held together in a single file which looks like:

    GLM g;      /* information about the model */
    OUTPUT o;   /* information about the output */

    f1(g){ ....... }
    f2(g,o) { ......}

(GLM and OUTPUT are typedefs defined in a .h file)

with an R function that looks like:

    {
    <R statements>

    .C("f1",.....)

    <R statements>

    .C("f2",.....)

    <R statements>

    ...

    }

and I rely on the fact that "g" and "o" remain the same between the
successive calls (there's a C call at the end to free up all the
memory). Is this:

1) bad,
2) so bad (-;) that the R archive wouldn't really be interested in code
   that worked in this way?

I *think* I can do it without using this technique, but it will need a
lot of re-programming. One way round it in S v5 uses the new .Call()
function---is anything similar on the agenda for future versions of R?

On another note, where in the library structure would be a good place to
put a LaTeX document describing the function and showing some examples?

Thanks,

Jonathan

--
Dr. Jonathan Myles            e-mail:jonathan.myles at mrc-bsu.cam.ac.uk
MRC Biostatistics Unit        Tel. 01223 330371
Institute of Public Health    FAX  01223 330388
University Forvie Site
Robinson Way
CAMBRIDGE CB2 2SR

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Thu, 27 May 1999, Jonathan Myles wrote:

>
> I am currently preparing an R library to estimate
> approximate posterior distributions for parameters in
> Generalised Linear Mixed Models by Gibbs Sampling
> which includes a lot of dynamically loaded C code.
> I'm pleased with the way in which it is working and am
> now hammering out some documentation, but closer consideration
> of some R documentation has led me to worry that my code
> a) may not achieve certain standards of R and C programming
> and b) may not work on all systems. (I'm using 0.64.1
> on a UNIX system)
>
> My worries are
>
> a) That I've used calloc rather than S_alloc throughout.
>
> b) That I've used the following technique: I've
> calls to different C programs held together in a single
> file which looks like:
>
<snip>
>
> and I rely on the fact that "g" and "o" remain the same between the successive
> calls (there's a C call at the end to free up all the memory). Is this:
>
> 1) bad,
> 2) so bad (-;) that the R archive wouldn't really be interested in code
> that worked in this way?

The same technique is used in the survival5 library, which seems to work
on a number of platforms. I think it's undesirable, but not disastrous.
Basically, if you do (b) you have to do (a), since R does not preserve
S_alloc memory across calls (I believe this differs from S-PLUS).

The two reasons why it's undesirable are:

1. It means that R can use much more memory than the specified heap
   sizes, making it harder to tell how big these can safely be set
   without swapping.
2. If the program is interrupted (ctrl-c) between calloc() and free(),
   it will leak memory.

The second problem should be fixable with on.exit().

> I *think* I can do it without using this technique, but it will need a lot
> of re-programming. One way round it in S v5 uses the new .Call()
> function---is anything similar on the agenda for future versions of R?

One way around this problem is to use the new and almost undocumented
.External() interface, which allows you to write "internal" R commands
and allocate memory on the R heap. This can save a lot of copying and
gives you more control over the allocation of memory. The disadvantages
are that the API is unstable and largely undocumented, and that it gives
you more control over the allocation of memory (so you need to be aware
of things like garbage collection).

> On another note, where in the library structure would be a good
> place to put a LaTeX document describing the function and showing
> some examples?

I would put it at the top level, or possibly in a doc/ subdirectory. I
don't think there are even informal standards on this.
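The technique under discussion, and the leak risk mentioned above, can be illustrated with a small sketch in plain C. It is not the code from the original post: the GLM struct here is a hypothetical stand-in, and the function names (glm_init, glm_step, glm_get, glm_free) are made up for illustration. The functions take pointer arguments and return void, which is the shape .C expects. The one design point it tries to show is a cleanup routine that is safe to call more than once, so that it can be registered with on.exit() on the R side without worrying about whether the normal cleanup already ran.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the GLM typedef in the original post. */
typedef struct {
    int n;          /* number of parameters */
    double *beta;   /* current parameter values, heap-allocated */
} GLM;

/* File-scope state: it belongs to the loaded C code, not to R,
 * so it keeps its value between successive calls. */
static GLM g = {0, NULL};

/* Release the state. free(NULL) is a no-op, so calling this twice
 * (once normally, once from an exit handler) is harmless. */
void glm_free(void)
{
    free(g.beta);
    g.beta = NULL;
    g.n = 0;
}

/* Allocate the state. Any leftover allocation from an interrupted
 * earlier run is released first, which limits the leak. */
void glm_init(int *n)
{
    glm_free();
    assert(*n > 0);
    g.beta = calloc((size_t) *n, sizeof(double));
    if (g.beta == NULL)
        return;     /* allocation failed; g.n stays 0 */
    g.n = *n;
}

/* One hypothetical "update": add a step to every parameter. */
void glm_step(double *step)
{
    for (int i = 0; i < g.n; i++)
        g.beta[i] += *step;
}

/* Copy the current state into a caller-supplied buffer of length g.n. */
void glm_get(double *out)
{
    for (int i = 0; i < g.n; i++)
        out[i] = g.beta[i];
}
```

On the R side, a wrapper function would presumably call .C("glm_init", ...) first and then register on.exit(.C("glm_free")) before making the remaining calls, so that an interrupt between allocation and the final cleanup still releases the memory.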
Jonathan Myles <jonathan.myles at mrc-bsu.cam.ac.uk> writes:

> I am currently preparing an R library to estimate approximate
> posterior distributions for parameters in Generalised Linear Mixed
> Models by Gibbs Sampling which includes a lot of dynamically loaded
> C code. I'm pleased with the way in which it is working and am now
> hammering out some documentation, but closer consideration of some R
> documentation has led me to worry that my code a) may not achieve
> certain standards of R and C programming and b) may not work on all
> systems. (I'm using 0.64.1 on a UNIX system)

> My worries are
>
> a) That I've used calloc rather than S_alloc throughout.

Using calloc and free in C code loaded by R is fine. We would recommend
that you use the macros Calloc and Free defined in the S.h include file
(the file is named that for historical reasons). Those macros check
whether the allocation was successful and provide convenient coercion of
the type of the returned storage.

[discussion of a particular programming technique omitted]

> I *think* I can do it without using this technique, but it will need a lot
> of re-programming. One way round it in S v5 uses the new .Call()
> function---is anything similar on the agenda for future versions of R?

You are in luck. There is a facility in R called .External which is much
like the .Call facility in S-PLUS 5.x. Saikat DebRoy has written some
documentation on that. It would be in the development tree now except
that I was going to do a bit of editing on it. I can send you a preview
copy if you like. Please tell me if you can handle texinfo format.

> On another note, where in the library structure would be a good
> place to put a LaTeX document describing the function and showing
> some examples?

If you haven't already done so, please read the output from the R call

    help(library)

That contains information on how to structure your own libraries. The
preferred way to write the documentation is in a LaTeX-like formulation
called the R documentation (.Rd) format. Those files are put in the
"man" subdirectory of your library.

    R INSTALL <libname>

then converts those files to html, LaTeX, and ASCII formats when it
installs the library. It also extracts the examples that you put in
there so you can, for example, do things like

    library(MASS)
    example(boxcox)

to run the example. You can also use

    R CMD check <libname>

to run all the examples in your library, which is a good way of testing
that changes you have made in your code have not botched things up in
unsuspected ways. In computing this is known as regression testing, but
the name "regression" there does not refer to the statistical concept of
regression modeling.
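As a rough illustration of what "check the allocation and coerce the type" means, here is a plain C sketch written so that it compiles without any R headers. It is not the definition from S.h: the names checked_calloc, MY_CALLOC, MY_FREE and mean_of_sequence are hypothetical, and where this sketch prints a message and exits, the real macros presumably report the failure through R's own error mechanism instead.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate zeroed storage, or stop with a message
 * if the allocation fails. A zero-sized request may legitimately
 * return NULL, so that case is not treated as an error. */
static void *checked_calloc(size_t nelem, size_t elsize)
{
    void *p = calloc(nelem, elsize);
    if (p == NULL && nelem > 0 && elsize > 0) {
        fprintf(stderr, "checked_calloc: could not allocate %lu x %lu bytes\n",
                (unsigned long) nelem, (unsigned long) elsize);
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Hypothetical macros in the spirit of Calloc and Free: the first
 * casts the result to the requested type, the second frees the
 * storage and resets the pointer so a stale pointer is not reused. */
#define MY_CALLOC(n, type) ((type *) checked_calloc((size_t) (n), sizeof(type)))
#define MY_FREE(p) (free((void *) (p)), (p) = NULL)

/* Example use: mean of the sequence 1, 2, ..., n held in scratch storage. */
double mean_of_sequence(int n)
{
    assert(n >= 0);
    double *x = MY_CALLOC(n, double);
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        x[i] = i + 1;
        sum += x[i];
    }
    MY_FREE(x);
    return n > 0 ? sum / n : 0.0;
}
```

The calling code never has to test the returned pointer itself, which is the convenience being recommended above.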
On Thu, 27 May 1999, Jonathan Myles wrote:

> I am currently preparing an R library to estimate
> approximate posterior distributions for parameters in
> Generalised Linear Mixed Models by Gibbs Sampling
> which includes a lot of dynamically loaded C code.

If this is a version of your S glmm codes, we very much look forward to
it.

> I'm pleased with the way in which it is working and am
> now hammering out some documentation, but closer consideration
> of some R documentation has led me to worry that my code
> a) may not achieve certain standards of R and C programming
> and b) may not work on all systems. (I'm using 0.64.1
> on a UNIX system)
>
> My worries are
>
> a) That I've used calloc rather than S_alloc throughout.

Just use Calloc and Free: these are defined in S.h and (a) let R do the
error-checking and (b) on Windows ensure you get a malloc that can
actually free space.

> b) That I've used the following technique: I've
> calls to different C programs held together in a single
> file which looks like:
>
> GLM g; /* information about the model */
> OUTPUT o; /* information about the output */
>
> f1(g){ ....... }
> f2(g,o) { ......}
>
>
>
> (GLM and OUTPUT are
> typedefs defined in a .h file)
>
>
> with an R function that looks like :
> {
> <R statements>
>
> .C("f1",.....)
>
> <R statements>
>
> .C("f2",.....)
>
> <R statements>
>
> ...
>
> }
>
> and I rely on the fact that "g" and "o" remain the same between the successive
> calls (there's a C call at the end to free up all the memory). Is this:
>
> 1) bad,
> 2) so bad (-;) that the R archive wouldn't really be interested in code
> that worked in this way?

I am not clear about this. Do you rely on g and o remaining the same or
their addresses remaining the same? And if the latter, you don't need to
pass g in again. Garbage collection will move objects, but nothing in R
will alter them. So on my second reading of this, there is no problem.
You might want to try us on a more specific example.

> I *think* I can do it without using this technique, but it will need a lot
> of re-programming. One way round it in S v5 uses the new .Call()
> function---is anything similar on the agenda for future versions of R?

(It is S-PLUS 5.x or Sv4, I think.) There is already a .External call
for a very similar purpose. I find programming R internals a little
easier than .Call().

> On another note, where in the library structure would be a good
> place to put a LaTeX document describing the function and showing
> some examples?

Anything you put in inst gets installed, so I would suggest in inst/doc.
By no means all R users have LaTeX, so give a .ps or a .pdf version too,
please.

Brian Ripley

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595