Jeffrey Horner
2006-Oct-09 21:40 UTC
[Rd] Discussion starter for package level Connection API
Thought I'd try to start a discussion. Feel free to jump in. I guess R needs to strike the right balance between opening up the internals to package writers and not allowing them to do bad things.

My first attempt at cracking this nut is to just memcpy() the Rconnection and not allow access to the private stuff:

/* Alternative to allowing C code access to connection API. */
Rconnection R_GetConnection(Rconnection ucon, int idx){
    Rconnection rcon;

    /* Valid connection? */
    if ((rcon = getConnection(idx)) == NULL)
        return NULL;

    memcpy(ucon,rcon,sizeof(struct Rconn));

    /* Don't reveal private data */
    ucon->private = NULL;

    return ucon;
}

This would take a user-allocated Rconnection and fill out all structure members but deny access to the private data. It also presumes that the full Rconnection structure is available in an R_ext/ header file. This has the advantage of giving access to all the function pointers, so that data can be pushed and pulled through the connection without any knowledge of what's in the private area.

The first problem is with the class and description members. What to do? Because memcpy() copies only the pointers, the user's copy shares these strings with the original, so the user could do bad things like rename the class or description. If the function copied the strings instead, then the user would have to deallocate them as well.

Then there are the PushBack members, to which the user would have full access. It looks like they're only used in text connections. Would these be better off placed in the private member structure?

Also, the user can call close on the copied connection without the original's isopen member being updated.

Here's a rather restrictive approach whereby the user must know the integer index of the connection. Each function is a wrapper around the related Rconnection member.

int R_VfprintfConnection(int idx, const char *format, va_list ap){
    Rconnection con = getConnection(idx);

    if (!con) return -1; /* just like fprintf(3)? */
    if(!con->isopen) error(_("connection is not open"));
    if(!con->canwrite) error(_("cannot write to this connection"));

    return con->vfprintf(con,format,ap);
}

int R_FgetcConnection(int idx){
    Rconnection con = getConnection(idx);

    if (!con) return EOF; /* just like fgetc(3)? */
    if(!con->isopen) error(_("connection is not open"));
    if(!con->canread) error(_("cannot read from this connection"));

    return con->fgetc(con);
}

double R_SeekConnection(int idx, double where, int origin, int rw){
    Rconnection con = getConnection(idx);

    if (!con) return -1; /* just like fseek(3)? */
    if(!con->isopen) error(_("connection is not open"));
    if(!con->canseek) error(_("cannot seek on this connection"));

    return con->seek(con,where,origin,rw);
}

void R_TruncateConnection(int idx){
    Rconnection con = getConnection(idx);

    if (con) con->truncate(con);
}

int R_FlushConnection(int idx){
    Rconnection con = getConnection(idx);

    if (!con) return EOF; /* like fflush(3) */

    return con->fflush(con);
}

size_t R_ReadConnection(int idx, void *buf, size_t size, size_t n){
    Rconnection con = getConnection(idx);

    if (!con) return 0;
    if(!con->isopen) error(_("connection is not open"));
    if(!con->canread) error(_("cannot read from this connection"));

    return con->read(buf,size,n,con);
}

size_t R_WriteConnection(int idx, void *buf, size_t size, size_t n) {
    Rconnection con = getConnection(idx);

    /* size_t is unsigned, so report "no items written", like fwrite(3),
     * rather than -1 as write(2) does */
    if (!con) return 0;
    if(!con->isopen) error(_("connection is not open"));
    if(!con->canwrite) error(_("cannot write to this connection"));

    return con->write(buf, size, n, con);
}

Thus, the user has no access to the Rconnection at all. My only question is whether there's too much overhead in calling getConnection(), especially when calling R_FgetcConnection() in a loop.

Jeff
--
http://biostat.mc.vanderbilt.edu/JeffreyHorner
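
The closing question about getConnection() overhead can be made concrete with a small, self-contained sketch. It does not use R's headers: mock_conn, mock_table, mock_get_connection() and the other mock_* names are hypothetical stand-ins for struct Rconn, the Connections table and getConnection(). It only counts how many index lookups a character-at-a-time loop performs compared with a block read. It says nothing about how expensive one lookup is inside R.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct Rconn: only what the sketch needs. */
typedef struct mock_conn {
    int isopen, canread;
    const char *data;   /* backing bytes of this in-memory connection */
    size_t len, pos;    /* total length and current read position */
} mock_conn;

#define MOCK_NCONNECTIONS 4
static mock_conn *mock_table[MOCK_NCONNECTIONS];
static unsigned long mock_lookups;  /* lookup counter, for illustration only */

/* Stand-in for getConnection(): bounds-check the index, fetch the entry. */
static mock_conn *mock_get_connection(int idx)
{
    mock_lookups++;
    if (idx < 0 || idx >= MOCK_NCONNECTIONS) return NULL;
    return mock_table[idx];
}

/* One lookup per character, in the style of R_FgetcConnection(). */
static int mock_fgetc_connection(int idx)
{
    mock_conn *con = mock_get_connection(idx);
    if (!con || !con->isopen || !con->canread) return EOF;
    if (con->pos >= con->len) return EOF;
    return (unsigned char) con->data[con->pos++];
}

/* One lookup per block, in the style of R_ReadConnection().
 * Returns the number of complete items copied into buf. */
static size_t mock_read_connection(int idx, void *buf, size_t size, size_t n)
{
    mock_conn *con = mock_get_connection(idx);
    size_t avail, items;
    assert(buf != NULL);
    if (!con || !con->isopen || !con->canread || size == 0) return 0;
    avail = (con->len - con->pos) / size;
    items = n < avail ? n : avail;
    memcpy(buf, con->data + con->pos, items * size);
    con->pos += items * size;
    return items;
}

/* Drain connection idx one character at a time; return lookups performed. */
static unsigned long mock_count_lookups_fgetc(int idx)
{
    mock_lookups = 0;
    while (mock_fgetc_connection(idx) != EOF) { /* discard the byte */ }
    return mock_lookups;
}

/* Drain connection idx in 8-byte blocks; return lookups performed. */
static unsigned long mock_count_lookups_read(int idx)
{
    char buf[8];
    mock_lookups = 0;
    while (mock_read_connection(idx, buf, 1, sizeof buf) > 0) { /* discard */ }
    return mock_lookups;
}
```

For a 20-byte connection, the character loop performs 21 lookups (20 bytes plus the call that returns EOF) and the 8-byte block loop performs 4. If the lookup did turn out to matter, callers could amortize it by preferring the block read wrapper inside loops, without being handed the Rconnection itself.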
Jeffrey Horner
2006-Oct-10 04:46 UTC
[Rd] Discussion starter for package level Connection API
Here's a function to create new connections. It returns an SEXP that can be returned to R code, but before that, the user can set up all the function pointer members by passing in the address of an Rconnection. Here's how you'd call it:

SEXP newcon;
Rconnection conptr;

newcon = R_NewConnection("newcon","newdesc","w",&conptr);

conptr->vfprintf = apachecon_vfprintf;
conptr->write = apachecon_write;
...
conptr->private = apachecon_init(); /* or whathaveyou */

return newcon;

And if you wanted to know the index, all you have to do is get it from asInteger(newcon).

Here's the code for R_NewConnection:

SEXP R_NewConnection(char *class, char *description, char *mode, Rconnection *con)
{
    SEXP sclass, ans;
    Rconnection new = NULL;
    int ncon;

    /* Get index before we allocate memory */
    ncon = NextConnection();

    new = (Rconnection) malloc(sizeof(struct Rconn));
    if(!new) error(_("allocation of new connection failed"));

    new->class = (char *) malloc(strlen(class) + 1);
    if(!new->class) {
        free(new);
        error(_("allocation of new connection failed"));
    }
    strcpy(new->class, class);

    new->description = (char *) malloc(strlen(description) + 1);
    if(!new->description) {
        free(new->class); free(new);
        error(_("allocation of new connection failed"));
    }

    /* init_con() is expected to copy description into new->description */
    init_con(new, description, mode);
    new->isopen = TRUE;
    new->canread = (strcmp(mode, "r") == 0);
    new->canwrite = (strcmp(mode, "w") == 0);
    new->destroy = &null_close;
    new->private = NULL;

    /* Add to Connections, and also
     * allow users to assign function
     * pointer members
     */
    *con = Connections[ncon] = new;

    /* Create something to pass to R code */
    PROTECT(ans = allocVector(INTSXP, 1));
    INTEGER(ans)[0] = ncon;
    PROTECT(sclass = allocVector(STRSXP, 2));
    SET_STRING_ELT(sclass, 0, mkChar(class));
    SET_STRING_ELT(sclass, 1, mkChar("connection"));
    classgets(ans, sclass);
    UNPROTECT(2);

    return ans;
}

Jeff
--
http://biostat.mc.vanderbilt.edu/JeffreyHorner
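
The "allocate, register, then let the caller fill in the methods" pattern used by R_NewConnection can be tried outside R with a self-contained sketch. Everything here is a hypothetical stand-in rather than R's API: mock_conn for struct Rconn, mock_table for Connections, and mock_new_connection() for R_NewConnection (it returns a plain index instead of an SEXP, and -1 instead of calling error()). One consequence of the pattern shows up directly: the connection is registered before its function pointers are set, so an index-based wrapper has to tolerate a missing method.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct Rconn. */
typedef struct mock_conn mock_conn;
struct mock_conn {
    char *class_name;
    char *description;
    int isopen, canread, canwrite;
    size_t (*write)(const void *buf, size_t size, size_t n, mock_conn *con);
    void *private_data;
};

#define MOCK_NCONNECTIONS 8
static mock_conn *mock_table[MOCK_NCONNECTIONS];

/* Duplicate a string with malloc(); returns NULL if allocation fails. */
static char *mock_strdup(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy) strcpy(copy, s);
    return copy;
}

/* Allocate a connection, register it in the first free slot, and hand the
 * pointer back through *con so the caller can assign methods.
 * Returns the table index, or -1 on failure. */
static int mock_new_connection(const char *class_name, const char *description,
                               const char *mode, mock_conn **con)
{
    int idx;
    mock_conn *nc;

    assert(con != NULL);
    for (idx = 0; idx < MOCK_NCONNECTIONS; idx++)
        if (!mock_table[idx]) break;
    if (idx == MOCK_NCONNECTIONS) return -1;   /* table is full */

    nc = malloc(sizeof *nc);
    if (!nc) return -1;
    nc->class_name = mock_strdup(class_name);
    nc->description = mock_strdup(description);
    if (!nc->class_name || !nc->description) {
        free(nc->class_name);      /* free(NULL) is a no-op */
        free(nc->description);
        free(nc);
        return -1;
    }
    nc->isopen = 1;
    nc->canread = (strcmp(mode, "r") == 0);
    nc->canwrite = (strcmp(mode, "w") == 0);
    nc->write = NULL;              /* caller is expected to set this */
    nc->private_data = NULL;

    *con = mock_table[idx] = nc;
    return idx;
}

/* Index-based write wrapper; returns 0 items if the connection is missing,
 * closed, read-only, or has no write method yet. */
static size_t mock_write_connection(int idx, const void *buf,
                                    size_t size, size_t n)
{
    mock_conn *con;
    if (idx < 0 || idx >= MOCK_NCONNECTIONS) return 0;
    con = mock_table[idx];
    if (!con || !con->isopen || !con->canwrite || !con->write) return 0;
    return con->write(buf, size, n, con);
}

/* Example caller-supplied method: adds the byte count to a size_t that
 * private_data points at, and reports all n items as written. */
static size_t counting_write(const void *buf, size_t size, size_t n,
                             mock_conn *con)
{
    (void) buf;
    *(size_t *) con->private_data += size * n;
    return n;
}
```

A caller would obtain the index from mock_new_connection("newcon", "newdesc", "w", &con), then assign con->write and con->private_data, mirroring the conptr assignments in the post. Until the write method is assigned, mock_write_connection() returns 0 rather than calling through a null pointer.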