This is a question and maybe an announcement. We've been discussing in the group that it would be nice to have a mechanism for something like "inline" C/C++ function calls in R. I do not want to reinvent the wheel, therefore, if something like that already exists, please give me a hint -- I could not find anything. If not, here is a working solution, please criticise so I could improve it. Example: I work on images (Bioconductor:EBImage) and just came to a point where I need to apply certain functions to image data, which are grey scale intensities in the range [0,1]. Assume I want to transform my image data from I(x,y,i) to exp(-(d/s)^2)*I(x,y,i), where I is the original intensity in dependence on coordinates x,y and frame i; s is a given value and d^2=(x-centre.x)^2+(y-centre.y)^2 for a given centre. Trying an R loop will run forever already on moderate image sizes as I do not see how to vectorize it. Now, below is the solution using the "inline" C code, completely in R and runs instantly. I created a small package "inline" that simply encapsulates a quite simple function "cfunction". The package source is available from http://www.ebi.ac.uk/~osklyar/inline -- please give it a try and I would be happy to hear your comments, both on already existing implementations for "inline" calls and on the current one. It is a draft and I will be happy to improve it (especially comments on how to ensure that the shared object is unloaded when the function is removed). In the example below EBImage is not required, I use randomly generated values instead of images, but the output it quite obvious. After installing "inline" the example should just work by copy-pasting. Best and thanks in advance, Oleg code <- character(17) code[1] <- " SEXP res;" code[2] <- " int nprotect = 0, nx, ny, nz, x, y;" code[3] <- " PROTECT(res = Rf_duplicate(a)); nprotect++;" code[4] <- " nx = INTEGER(GET_DIM(a))[0];" code[5] <- " ny = INTEGER(GET_DIM(a))[1];" code[6] <- " nz = INTEGER(GET_DIM(a))[2];" code[7] <- " double sigma2 = REAL(s)[0] * REAL(s)[0], d2 ;" code[8] <- " double cx = REAL(centre)[0], cy = REAL(centre)[1], *data, *rdata;" code[9] <- " for (int im = 0; im < nz; im++) {" code[10] <- " data = &(REAL(a)[im*nx*ny]); rdata &(REAL(res)[im*nx*ny]);" code[11] <- " for (x = 0; x < nx; x++)" code[12] <- " for (y = 0; y < ny; y++) {" code[13] <- " d2 = (x-cx)*(x-cx) + (y-cy)*(y-cy);" code[14] <- " rdata[x + y*nx] = data[x + y*nx] * exp(-d2/sigma2);" code[15] <- " }}" code[16] <- " UNPROTECT(nprotect);" code[17] <- " return res;" library(inline) funx <- cfunction(signature(a="array", s="numeric", centre="numeric"), code) x <- array(runif(50*50), c(50,50,1)) res <- funx(a=x, s=15, centre=c(30,35)) image(res[,,1]) res <- funx(x, 10, c(15,15)) x11(); image(res[,,1]) -- Dr Oleg Sklyar * EBI/EMBL, Cambridge CB10 1SD, England * +44-1223-494466
Oleg, On May 22, 2007, at 1:59 PM, Oleg Sklyar wrote:> We've been discussing in the group that it would be nice to have a > mechanism for something like "inline" C/C++ function calls in R. I > do not want to reinvent the wheel, therefore, if something like > that already exists, please give me a hint -- I could not find > anything. If not, here is a working solution, please criticise so I > could improve it. > > Example: I work on images (Bioconductor:EBImage) and just came to a > point where I need to apply certain functions to image data, which > are grey scale intensities in the range [0,1]. Assume I want to > transform my image data from I(x,y,i) to exp(-(d/s)^2)*I(x,y,i), > where I is the original intensity in dependence on coordinates x,y > and frame i; s is a given value and d^2=(x-centre.x)^2+(y-centre.y) > ^2 for a given centre. Trying an R loop will run forever already on > moderate image sizes as I do not see how to vectorize it. >That is actually a (rare) case that can be completely vectorized: d=(cx-rep(1:dim(I)[1],dim(I)[2]*dim(I)[3]))^2+(cy-rep(1:dim(I) [2],each=dim(I)[1],times=dim(I)[3]))^2 I=I*exp(-(d/s^2)) Clearly the drawback is the use of memory, but you could vectorize per frame if you wish. At any rate it's not that slow anyway: > I=array(runif(100*100*10),c(100,100,10)) > system.time({d=(cx-rep(1:dim(I)[1],dim(I)[2]*dim(I)[3]))^2+(cy-rep (1:dim(I)[2],each=dim(I)[1],times=dim(I)[3]))^2;I=I*exp(-(d/s^2))}) user system elapsed 0.022 0.010 0.032 > system.time(funx(I,15,c(30,35))) user system elapsed 0.008 0.001 0.010 Of course C wins, no doubt about that :).> Now, below is the solution using the "inline" C code, completely in > R and runs instantly. I created a small package "inline" that > simply encapsulates a quite simple function "cfunction". The > package source is available from http://www.ebi.ac.uk/~osklyar/ > inline -- please give it a try and I would be happy to hear your > comments, both on already existing implementations for "inline" > calls and on the current one.I really like the idea! Except for the fact that it's forcing the use of C++ which adds unnecessary overhead :P I'd like a configurable extension [including .m] and the ability to prepend functions as code. What would be very useful now is a set of tools that would allow you to construct the source with R commands, so that you could compute on it, edit it etc. That would be really cool ... you could even imagine compiling a very restricted set of R into C code ... yes, I know I'm dreaming ;) Thanks for the good idea! Cheers, Simon
On 5/22/2007 1:59 PM, Oleg Sklyar wrote:> This is a question and maybe an announcement. > > We've been discussing in the group that it would be nice to have a > mechanism for something like "inline" C/C++ function calls in R. I do > not want to reinvent the wheel, therefore, if something like that > already exists, please give me a hint -- I could not find anything. If > not, here is a working solution, please criticise so I could improve it.This would be nice. One suggestion that probably doesn't affect your package: It would be even nicer if R incorporated something that Duncan Temple Lang suggested last year, namely a new kind of quoting that didn't need escapes in the string. He suggested borrowing triple quotes from Python; I suggested something more like heredocs as in shells or Perl, or like \verb in TeX, in case you wanted triple quotes in your C function. It would be nice to settle on something, so that instead of > code[1] <- " SEXP res;" > code[2] <- " int nprotect = 0, nx, ny, nz, x, y;" > code[3] <- " PROTECT(res = Rf_duplicate(a)); nprotect++;" > code[4] <- " nx = INTEGER(GET_DIM(a))[0];" > code[5] <- " ny = INTEGER(GET_DIM(a))[1];" one could use (for example): code <- ''' SEXP res; int nprotect = 0, nx, ny, nz, x, y; PROTECT(res = Rf_duplicate(a)); nprotect++; nx = INTEGER(GET_DIM(a))[0]; ny = INTEGER(GET_DIM(a))[1]; ''' I think the only way this would affect your package is that you need to be able to handle embedded newlines in strings, but I can't see why you wouldn't be able to do that automatically. Another suggestion: You might want to allow a list of signatures and code chunks to be passed to the compiler, producing a list of functions to evaluate them, all compiled into one DLL. Duncan Murdoch