Advice on parsing / overriding function calls

What are you trying to defend against?  A serious attacker could still
use rm/assign/get/eval/... to circumvent your replaced functions.  I
think it would be very difficult (if not impossible) to prevent this
from happening, especially if the user can load packages.
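
To illustrate the point about circumventing replaced functions, here is a minimal sketch (not from the original message). The masking definition of scan is a hypothetical stand-in for whatever "safe" replacement is installed; everything else is base R.

```r
# A hypothetical "safe" replacement that masks scan in the current environment.
scan <- function(...) stop("scan is disabled")

# The masked name now fails when the user script calls it directly...
masked_result <- tryCatch(scan(text = "1 2 3"),
                          error = function(e) conditionMessage(e))

# ...but the original is still reachable by namespace or environment lookup.
by_namespace <- base::scan(text = "1 2 3", quiet = TRUE)
by_get <- get("scan", envir = baseenv())(text = "1 2 3", quiet = TRUE)

# Or the replacement can simply be removed, exposing the original again.
rm(scan)
after_rm <- scan(text = "1 2 3", quiet = TRUE)
```

Each of the three routes reads the data despite the replacement, which is why masking alone is unlikely to hold against a determined user.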

Hadley

On 8/16/07, Michael Cassin <michael at cassin.name> wrote:
Hi,

I am trying to tighten file I/O security on a process that passes a
user-supplied script to R CMD BATCH.  Broadly speaking, I'd like to restrict
I/O to a designated path on the file system. Right now, I'm trying to
address this in the R environment by forcing the script to use modified
versions of scan, read.table, sys.load.image, etc.

I can run a string replacement on the user-supplied script so that, for example,
"scan(" is replaced by "Safe.scan("

e.g.

SafePath <- function(file) {
  # keep only the last path component and re-root it under "safepath"
  fp <- strsplit(file, "/")[[1]]
  paste("safepath", fp[length(fp)], sep = "/")
}
SafePath("/etc/passwd")
[1] "safepath/passwd"

Safe.scan <- function(file, ...) scan(SafePath(file), ...)
Safe.scan("/etc/passwd", what = "", sep = "\n")
Error in file(file, "r") : unable to open connection
In addition: Warning message:
cannot open file 'safepath/passwd', reason 'No such file or directory'
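
As a possible variation on the wrapper above (a sketch only, not the poster's code), the path could be resolved and compared against the permitted directory instead of being rewritten. The names is_within and checked_scan are hypothetical. This only works for files that already exist, and it does nothing about the ways of bypassing wrappers raised elsewhere in this thread.

```r
# Hypothetical helper: TRUE only if an existing `path` resolves to a
# location underneath `root` once ".." and symlinks are expanded.
is_within <- function(path, root) {
  root_abs <- normalizePath(root, winslash = "/", mustWork = TRUE)
  path_abs <- tryCatch(
    normalizePath(path, winslash = "/", mustWork = TRUE),
    error = function(e) NA_character_   # missing files are rejected
  )
  # The trailing "/" stops "/data/safe2" from matching the root "/data/safe".
  !is.na(path_abs) && startsWith(path_abs, paste0(root_abs, "/"))
}

# Hypothetical wrapper: refuse to read anything outside `root`.
checked_scan <- function(file, root, ...) {
  if (!is_within(file, root)) stop("path is outside the permitted directory")
  scan(file, ...)
}
```

With this, checked_scan("/etc/passwd", root = "safepath") stops with an error rather than silently redirecting the read, provided the safepath directory exists.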

I'd appreciate any critique of this approach.  Is there something more
effective or elegant?

Regards,
Mike

        [[alternative HTML version deleted]]

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

http://had.co.nz/
On Aug 16, 2007, at 9:23 AM, Hin-Tak Leung wrote:

Well, I think there are some serious uses, e.g. offering a web server
where a script is uploaded and the Rout result is downloaded back...

The issue is more about whether he wants to block *all* file system
access or just restrict it to certain areas. For the former,
I would set up a chroot jail and run R from within it; for the latter,
I would probably use LD_PRELOAD to override
all the file-system-accessing functions in libc directly.
That would fix the problem with system(rm) and the like, I think,
because if your entire R process and any sub-process R launches have no
access to the genuine libc fwrite/fread/etc. functions, you cannot do
any damage, right?
Both are tricky and take time to do (the chroot jail is a bit easier,
actually...), but quite doable.

It depends on (1) how paranoid you are, and (2) how much trouble you are
willing to go to in order to achieve those restrictions...

a sneaky trick:

for each compute session, automate setting up a zone ("solaris 
containers") on a solaris 10+ box.  if you have a 
preinstalled/preconfigured zone template, snapshotted with zfs, you can 
roll out a new compute zone in literally seconds.  you can quota it, limit 
the amount of CPU it gets, etc.  really not very difficult at all to set 
up.  sun's tools are *great* for this nowadays.

this is substantially safer than chroot() or LD_PRELOAD tricks, and lets
you do this stuff without having to reinvent the wheel.

it also reduces overhead to the point where you really *can* set up a 
naked compute (well, with R in it...) environment for every compute 
session getting instantiated.  in way, way, way less time than it takes 
for the computations to actually run.

if someone does system(rm) in a container... who cares?  they just trashed 
their own session, and nothing else.  just blow the trashed ones away 
periodically.

--e
Thinking along these lines, we actually have a mechanism for  
replacing the system call (it's used by the Mac GUI to allow root  
calls) and one could think of expanding this to all critical  
operations. Clearly, there are issues (speed for example), but it  
would be nice to have a 'fortified' version of R that allows turning
on restrictions. I don't think it's easy, but given the rising demand  
(at least in my perception), it would be interesting to see how far  
we can get.

Re filtering strings in commands - I don't think this will work,
because you can compute on the language: you can construct
arbitrary calls without using the names verbatim, so it is
possible to circumvent such filters fairly easily.
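
A minimal sketch of what computing on the language makes possible (added for illustration, not part of the original message). The temporary file and the variable names are assumptions; the point is that the reader's name is assembled at run time, so the source text never contains the function name followed by an opening parenthesis for a string filter to match.

```r
# Write a small file to read back; tempfile() keeps the demo self-contained.
tmp <- tempfile(fileext = ".txt")
writeLines(c("alpha", "beta"), tmp)

# Assemble the function name from pieces at run time.
reader_name <- paste0("sc", "an")
args <- list(file = tmp, what = "", sep = "\n", quiet = TRUE)

# Route 1: look the function up by its computed name and call it.
via_do_call <- do.call(reader_name, args)

# Route 2: build the call object explicitly, then evaluate it.
built_call <- as.call(c(list(as.name(reader_name)), args))
via_eval <- eval(built_call)

unlink(tmp)
```

Both routes return the file contents, so a textual search-and-replace on the script would not have intercepted either one.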

Cheers,
Simon

Exactly.  All I would need is access to a file() connection, and I could 
easily do that in such a way that 'file' never appeared in the script.
And I've thought of half a dozen backdoors that have not been mentioned in 
this thread.

I am not sure there is really much point in trying to fortify R, when 
that's the OS's job and it may well be better to run R in a suitable 
sandbox.  Certainly I think that is the solution for web services.

One area where it may be necessary is embedded applications.  Certainly if 
R is embedded into the same process (which is how R as an shlib or DLL is 
usually used) then you may want the main application to have privileges 
you do not give to the embedded R.  But using a separate process (e.g. via 
Rserve) may be more secure.

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595