Hello,

I have a setup similar to Rweb ( http://www.math.montana.edu/Rweb/ ): I get R scripts from users and need to execute them in a safe manner (they are executed automatically, without human inspection).

I would like to limit the user's script to reading from STDIN and writing to STDOUT/STDERR. Specifically, I want to prevent any kind of interaction with the underlying operating system (files, sockets, system(), etc.).

I've found this old thread:
http://r.789695.n4.nabble.com/R-in-a-sandbox-jail-td921991.html
But for technical reasons I'd prefer not to set up a chroot jail.

I have written a patch that adds a "--sandbox" parameter. When this parameter is used, the user's script can't create any kind of connection object or run "system()".

My plan is to run R like this:

    cat INPUT | R --vanilla --slave --sandbox --file SCRIPT.R > OUTPUT

Where 'INPUT' is my chosen input and 'SCRIPT.R' is the script submitted by the user. If the script tries to create a connection or run a disabled function, an error is printed.

This is the patch:
http://cancan.cshl.edu/labmembers/gordon/files/R_2.11.0_sandbox.patch

So my questions are:
1. Would you be willing to consider this feature for inclusion?
2. Are there any other 'dangerous' functions I need to intercept (".Internal" perhaps?)

All comments and suggestions are welcome,
thanks,
-gordon

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

On 18/05/2010 10:38 PM, Assaf Gordon wrote:
> I have written a patch that adds a "--sandbox" parameter.
> When this parameter is used, the user's script can't create any kind of connection object or run "system()".

That sounds too restrictive. R uses connections internally in various places, with no reference to the file system. It also uses them when reading its own files. So if you stop a user from creating connections, you'll somehow need to distinguish between user-created ones and internally necessary ones: not easy.

> 2. Are there any other 'dangerous' functions I need to intercept ( ".Internal" perhaps ?)

.Internal is needed by tons of base functions. So again, you'll need to distinguish where the call is coming from, and that's not easy.

Duncan Murdoch
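
A rough sketch of both points, under a few assumptions: the script text and variable names below are made up for illustration, and the example only uses base R. A text connection is a connection object that never touches the file system, capture.output() collects printed output through one, and the body of paste() is a call to .Internal().

```r
# A user script held purely in memory (hypothetical content).
script <- "x <- 1 + 2\nprint(x)"

# textConnection() creates a connection object without touching the file system.
con <- textConnection(script)
exprs <- parse(con)
close(con)

# capture.output() itself gathers the printed text through a text connection.
out <- capture.output(for (e in exprs) eval(e))

# The body of base::paste is a call to .Internal(), as is the case for many
# base functions.
paste_uses_internal <- ".Internal" %in% all.names(body(base::paste))
```

A blanket ban on creating connections would therefore also break ordinary calls such as capture.output(), and a blanket ban on .Internal would break paste().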

I think you'll find it's a bit more complicated than that.

Firstly, R --sandbox is pretty crippled: as far as I can tell it can't load packages, since package loading uses gzfile(). This would include the 'stats' package.

If you can load packages, you would need to sanitize all those packages, since they may contain functions that talk directly to the operating system (for example, the 'foreign' package does).

Also, most functions called by .C() and many called by .Call() can be made to overwrite memory they don't own by passing invalid arguments, so the sandbox would only protect you from mistakes by the user and from incompetent attacks, but not from competent attacks.

-thomas

On Tue, 18 May 2010, Assaf Gordon wrote:
> I have written a patch that adds a "--sandbox" parameter.
> When this parameter is used, the user's script can't create any kind of
> connection object or run "system()".

Thomas Lumley
Assoc. Professor, Biostatistics
tlumley at u.washington.edu
University of Washington, Seattle
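
To get a feel for how much code would need sanitizing, here is a minimal sketch that lists the functions in a package namespace whose bodies mention one of R's native-code interfaces. The helper name native_callers() is hypothetical, and the check is only a name match on the function body, so treat the result as a rough lower bound rather than an audit.

```r
# Hypothetical helper: names of functions in a package namespace whose bodies
# mention one of R's native-code interfaces.
native_callers <- function(pkg) {
  ns <- asNamespace(pkg)
  objs <- mget(ls(ns, all.names = TRUE), envir = ns)
  fns <- Filter(is.function, objs)
  interfaces <- c(".C", ".Call", ".External", ".Fortran")
  hits <- vapply(
    fns,
    function(f) any(interfaces %in% all.names(body(f))),
    logical(1)
  )
  names(fns)[hits]
}

# Functions in 'stats' that call into compiled code directly.
stats_native <- native_callers("stats")
length(stats_native)
```

Every function reported this way is a place where invalid arguments might reach compiled code, which is the attack surface described above.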

How about some "computing on the language", something like this:

    exprs <- parse("SCRIPT.R")
    invalids <- c(".Internal", ".Primitive")
    if (any(invalids %in% all.names(exprs)))
        stop("sandbox check failed")

I believe this would prevent evaluating any direct calls to '.Primitive' and '.Internal'. Of course, you could extend the 'invalids' vector to include any names. If you want to consider arguments to calls (e.g. the argument to 'file' or 'library') or something more sophisticated, check out the functions in the codetools package, something like this:

    library(codetools)

    walkerCall <- function(e, w) {
        # name of the function being called, if it is a plain symbol
        fn <- if (is.symbol(e[[1]])) as.character(e[[1]]) else ""
        # stop .Internal calls
        if (fn == ".Internal")
            stop("invalid '.Internal()' call")
        # restrict file() to STDIN
        if (fn == "file") {
            mc <- match.call(file, e)
            if (!identical(mc$description, "stdin"))
                stop("'file()' only valid with 'description=\"stdin\"'")
        }
        # recurse into the function position and every argument
        for (ee in as.list(e))
            if (!missing(ee)) walkCode(ee, w)
    }

    walker <- makeCodeWalker(call = walkerCall, leaf = function(e, w) {})
    exprs <- parse("SCRIPT.R")
    for (expr in exprs) walkCode(expr, walker)

I'm a little surprised that there isn't a 'sandbox' package or something similar to this. A reverse depends check on the codetools package indicates there is not. However, I believe there is some demand for it.

Matt Shotwell
http://biostatmatt.com

On Tue, 2010-05-18 at 22:38 -0400, Assaf Gordon wrote:
> 2. Are there any other 'dangerous' functions I need to intercept ( ".Internal" perhaps ?)
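
A small sketch of the name-based check, wrapped in a hypothetical helper passes_name_check() so it can be tried on strings instead of a file. The helper name and the addition of "system" to the forbidden list are assumptions made for the example. It also shows one limit of the approach: all.names() reports symbols only, so a function that is named by a character string is not detected.

```r
# Hypothetical helper: TRUE if none of the forbidden names appear as symbols
# anywhere in the parsed code.
passes_name_check <- function(code,
                              invalids = c(".Internal", ".Primitive", "system")) {
  exprs <- parse(text = code)
  !any(invalids %in% all.names(exprs))
}

# A direct call contains the symbol 'system', so the check rejects it.
direct <- passes_name_check('system("ls")')

# Here "system" is only a character constant, so the check lets it through.
indirect <- passes_name_check('do.call("system", list("ls"))')
```

With these inputs, direct is FALSE and indirect is TRUE, so a static check of this kind would presumably need to be combined with restrictions on functions such as do.call(), get(), and eval() as well.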

On Tue, May 18, 2010 at 7:38 PM, Assaf Gordon <assafgordon at gmail.com> wrote:
> I've found this old thread:
> http://r.789695.n4.nabble.com/R-in-a-sandbox-jail-td921991.html
> But for technical reasons I'd prefer not to set up a chroot jail.

I would also point out that the state of the art in the operating system community has moved on significantly since 1982 when chroot was added. BSD Jails, Solaris Zones/Containers, SELinux, etc. all provide much more control over the system calls, network connections, and file and device access granted to applications in different jails/zones. These operating system capabilities solve some of exactly the problems you are trying to solve by painstakingly modifying R, but in a more secure and configurable manner.

- Murray