Advice on parsing / overriding function calls
8 messages · Hadley Wickham, Michael Cassin, Hin-Tak Leung +3 more
What are you trying to defend against? A serious attacker could still use rm/assign/get/eval/... to circumvent your replaced functions. I think it would be very difficult (if not impossible) to prevent this from happening, especially if the user can load packages.

Hadley
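A minimal sketch of the kind of circumvention meant here, assuming the replacement is installed by masking `scan` in the script's environment (an assumption for illustration; the original post rewrites the script text instead). The wrapper body and the temporary file are hypothetical.

```r
# Hypothetical setup: a "safe" replacement that masks base::scan and
# refuses every read.
scan <- function(file, ...) stop("file access denied")

# A throwaway file standing in for something the script should not read.
secret <- tempfile()
writeLines("top secret", secret)

# Circumvention 1: fetch the original function straight from base.
real_scan <- get("scan", envir = baseenv())
via_get <- real_scan(secret, what = "", sep = "\n", quiet = TRUE)

# Circumvention 2: delete the mask, after which `scan` resolves to base again.
rm(scan)
via_rm <- scan(secret, what = "", sep = "\n", quiet = TRUE)

# Circumvention 3: overwrite the mask with the original via assign().
scan <- function(file, ...) stop("file access denied")
assign("scan", real_scan)
via_assign <- scan(secret, what = "", sep = "\n", quiet = TRUE)
rm(scan)
```

Each of the three reads succeeds even though the mask was in place when the script started.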
On 8/16/07, Michael Cassin <michael at cassin.name> wrote:
Hi,
I am trying to tighten file I/O security on a process that passes a
user-supplied script to R CMD BATCH. Broadly speaking, I'd like to restrict
I/O to a designated path on the file system. Right now, I'm trying to
address this in the R environment by forcing the script to use modified
versions of scan, read.table, sys.load.image, etc.
I can run a replace string on the user-supplied script so that, for example,
"scan(" is replaced by "Safe.scan("
e.g.
SafePath <- function(file) {
  fp <- strsplit(file, "/")[[1]]                # split the path into components
  paste("safepath", fp[length(fp)], sep = "/")  # keep only the last one
}
SafePath("/etc/passwd")
[1] "safepath/passwd"
Safe.scan <- function(file, ...) scan(SafePath(file),...)
Safe.scan("/etc/passwd",what="",sep="\n")
Error in file(file, "r") : unable to open connection
In addition: Warning message:
cannot open file 'safepath/passwd', reason 'No such file or directory'
I'd appreciate any critique of this approach. Is there something more
effective or elegant?
Regards,
Mike
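The substitution step described in this message could look roughly like the following. It is a sketch only: the `script` lines are made up, and doing the replace with `gsub()` and `fixed = TRUE` is an assumption about how it would be carried out. The target name follows the `Safe.scan` wrapper defined in the example.

```r
# A hypothetical user-supplied script, held as lines of text.
script <- c(
  'x <- scan("/etc/passwd", what = "", sep = "\\n")',
  'y <- myscan("data.txt")'
)

# Literal (non-regex) replacement of "scan(" by "Safe.scan(".
rewritten <- gsub("scan(", "Safe.scan(", script, fixed = TRUE)
print(rewritten)
```

A purely textual replace also touches any other identifier that ends in `scan`: the second line becomes a call to `mySafe.scan`.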
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Well, I think there are some serious uses, e.g. offering a web server where a script is uploaded and the Rout result is downloaded back...

The issue is more about whether he wants to limit *all* file system access or just limit it to certain areas. For the former, I would set up a chroot jail and run R from within; for the latter, I would probably do something with LD_PRELOAD to override all the file-system-accessing functions in libc directly, really. That would fix the problem with system(rm) and some such, I think, because if your entire R process and any sub-process R launches has no access to the genuine libc fwrite/fread/etc. functions, you cannot do any damage, right?

Both are tricky and take time to do (the chroot jail a bit easier, actually...), but quite do-able. It depends on (1) how paranoid you are, (2) how much trouble you want to have for yourself to achieve those restrictions...
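As a rough sketch of how the two options might be driven from a controlling R process: the helper below only assembles the pieces for `system2()` and does not run anything. The function name `jailed_batch`, the jail directory and the file names are all hypothetical, and it assumes R is already installed inside the jail.

```r
# Build (but do not run) a chroot invocation of R CMD BATCH.
# All paths and the helper name are hypothetical.
jailed_batch <- function(jail, infile, outfile, preload = NULL) {
  list(
    command = "chroot",  # running chroot normally requires root privileges
    # infile and outfile are interpreted relative to the jail's root.
    args = c(jail, "R", "CMD", "BATCH", "--vanilla", infile, outfile),
    # Optional LD_PRELOAD shim: a shared library loaded ahead of the others.
    env = if (is.null(preload)) character() else paste0("LD_PRELOAD=", preload)
  )
}

cmd <- jailed_batch("/srv/rjail", "script.R", "script.Rout")
# To launch for real: system2(cmd$command, cmd$args, env = cmd$env)
```

Passing a `preload` path would correspond to the library-override variant; writing that shim library is the hard part and is not shown.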
a sneaky trick:
for each compute session, automate setting up a zone ("solaris
containers") on a solaris 10+ box. if you have a
preinstalled/preconfigured zone template, snapshotted with zfs, you can
roll out a new compute zone in literally seconds. you can quota it, limit
the amount of CPU it gets, etc. really not very difficult at all to set
up. sun's tools are *great* for this nowadays.
this is substantially safer than chroot() or LD_PRELOAD tricks, and lets
you do this stuff without having to reinvent the wheel.
it also reduces overhead to the point where you really *can* set up a
naked compute (well, with R in it...) environment for every compute
session getting instantiated. in way, way, way less time than it takes
for the computations to actually run.
if someone does system(rm) in a container... who cares? they just trashed
their own session, and nothing else. just blow the trashed ones away
periodically.
--e
Thinking along these lines, we actually have a mechanism for replacing the system call (it's used by the Mac GUI to allow root calls) and one could think of expanding this to all critical operations. Clearly, there are issues (speed, for example), but it would be nice to have a 'fortified' version of R that allows turning on restrictions. I don't think it's easy, but given the rising demand (at least in my perception), it would be interesting to see how far we can get.

Re filtering strings in commands - I don't think this will work, because you can compute on the language, so you can construct arbitrary calls without using the names verbatim, so it is possible to circumvent such filters fairly easily.

Cheers,
Simon
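A small sketch of what "computing on the language" means for a text filter. The filter function `passes_filter` and the temporary file are hypothetical; the point is only that the call is assembled at run time, so the forbidden string never appears in the script text.

```r
# Hypothetical filter: reject any script whose text mentions "scan(".
passes_filter <- function(code) !grepl("scan(", code, fixed = TRUE)

# A throwaway file standing in for a protected one.
target <- tempfile()
writeLines(c("alpha", "beta"), target)

# The function name is pasted together at run time and turned into a call.
code <- 'f <- as.name(paste0("sc", "an"))
cl <- as.call(list(f, target, what = "", sep = "\\n", quiet = TRUE))
eval(cl)'

ok <- passes_filter(code)           # the filter finds nothing to object to
result <- eval(parse(text = code))  # yet scan() is what actually runs
```

Here `ok` is `TRUE` and `result` holds the two lines of the file.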
Exactly. All I would need is access to a file() connection, and I could easily do that in such a way that 'file' never appeared in the script. And I've thought of half a dozen backdoors that have not been mentioned in this thread.

I am not sure there is really much point in trying to fortify R, when that's the OS's job and it may well be better to run R in a suitable sandbox. Certainly I think that is the solution for web services.

One area where it may be necessary is embedded applications. Certainly if R is embedded into the same process (which is how R as an shlib or DLL is usually used) then you may want the main application to have privileges you do not give to the embedded R. But using a separate process (e.g. via Rserve) may be more secure.
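A minimal sketch of the separate-process idea, using a child `Rscript` as a stand-in for Rserve (a swapped-in simplification, not what the message proposes verbatim). The helper name `run_isolated` is hypothetical. Process separation alone restricts nothing; it only gives the OS a distinct process to confine, and the confinement itself (restricted user, chroot, container) is assumed to be set up outside R.

```r
# Run an untrusted script in a separate R process and capture its output.
run_isolated <- function(script_path) {
  rscript <- file.path(R.home("bin"), "Rscript")
  # --vanilla: do not read startup files or restore a saved workspace.
  system2(rscript, c("--vanilla", shQuote(script_path)), stdout = TRUE)
}

# Usage with a throwaway script.
script <- tempfile(fileext = ".R")
writeLines('cat("hello\\n")', script)
out <- run_isolated(script)
```

The parent process keeps its own privileges and only sees what the child writes to standard output.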
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595