Scanning an R script for potentially insidious commands

12 messages · Etienne Sévin, R. Michael Weylandt, Jan T. Kim +4 more

Etienne Sévin

Tue, Dec 18, 2012 4:48 AM #

Hey all,

We are building an R connector for our web application.
The user can upload a script so that it can be executed on the server.

Is there a way to scan the script for insidious commands (writing to the
disk, for example) and strip them out?
I guess a simple text search is not enough, so is there a way to analyse
the pseudo code?

Best,

Etienne

R. Michael Weylandt

Wed, Dec 19, 2012 3:28 AM #

On Dec 18, 2012, at 12:48 PM, Etienne Sévin <e.sevin at epiconcept.fr> wrote:

Not completely, as far as I know, but grepping for system() and eval() should catch the majority of red flags.
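As a rough illustration of the grepping idea, here is a minimal sketch in R. The helper name `flag_lines` and its pattern list are assumptions made for this example, not an existing tool, and a match only marks a line for review; it proves nothing about lines that do not match.

```r
# Hypothetical helper: report the lines of a script that mention functions
# commonly used to reach the shell or to evaluate constructed code.
flag_lines <- function(lines, patterns = c("system\\(", "eval\\(")) {
  hits <- grepl(paste(patterns, collapse = "|"), lines)
  data.frame(line = which(hits), text = lines[hits], stringsAsFactors = FALSE)
}

# The script is held as plain text here; nothing in it is executed.
script <- c(
  'x <- rnorm(10)',
  'system("rm -rf /tmp/data")',
  'eval(parse(text = "y <- 1"))',
  'mean(x)'
)
flagged <- flag_lines(script)
print(flagged)
```

For an uploaded file, `readLines()` on its path would supply the `lines` argument.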

Michael

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Joris Meys

Wed, Dec 19, 2012 3:39 AM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-devel/attachments/20121219/9e56c4e5/attachment.pl>

Jan T. Kim

Wed, Dec 19, 2012 4:02 AM #

On Wed, Dec 19, 2012 at 12:39:21PM +0100, Joris Meys wrote:

just out of curiosity, how do you disable these functions? Is there
a way to "blacklist" functions as such in R, regardless of what name
is used to call them? Simple string pattern matching (as I understand
Michael's "grepping" suggestion below) can be circumvented by using
the get function, as in

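    i <- c(19, 25, 19, 20, 5, 13)          # positions of s, y, s, t, e, m
    s <- paste(letters[i], collapse = "")  # builds the string "system"
    f <- get(s)                            # looks the function up by name
    f("insidiouscommand")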

where i contains suitable indices to produce "system". So the system
function needs disabling as such, as there are innumerable ways to
code up its invocation.
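To make the point concrete, a small sketch follows. The filter `is_flagged` is a hypothetical stand-in for any string-matching scan: the script text never contains `system(`, yet the function it retrieves is the real one.

```r
# Hypothetical naive filter: TRUE for lines containing "system(" or "eval(".
is_flagged <- function(lines) grepl("system\\(|eval\\(", lines)

# Script text that reaches system() without ever spelling out the call.
# It is only inspected as text here, never run.
script <- c(
  'i <- c(19, 25, 19, 20, 5, 13)',
  's <- paste(letters[i], collapse = "")',
  'f <- get(s)',
  'f("echo hello")'
)
filter_hit <- any(is_flagged(script))

# The same lookup done directly: the name is assembled at run time.
s <- paste(letters[c(19, 25, 19, 20, 5, 13)], collapse = "")
same_function <- identical(get(s), base::system)

print(c(filter_hit = filter_hit, same_function = same_function))
```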

Personally, I'd suggest considering long and hard whether executing
user-submitted R code is really necessary, and if that's the case, my
inclination would be to run it on a virtual machine and sandbox that
as much as you can.

Best regards, Jan

+- Jan T. Kim -------------------------------------------------------+
 |             email: jttkim at gmail.com                                |
 |             WWW:   http://www.jtkim.dreamhosters.com/              |
 *-----=<  hierarchical systems are for files, not for humans  >=-----*

Joris Meys

Wed, Dec 19, 2012 4:38 AM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-devel/attachments/20121219/d0b4447b/attachment.pl>

Dirk Eddelbuettel

Wed, Dec 19, 2012 5:33 AM #

Jeroen has a package devoted to the sandboxing approach in conjunction with
the system-level AppArmor facility:  RAppArmor.  See

  http://cran.r-project.org/web/packages/RAppArmor/index.html

and more details at

  https://github.com/jeroenooms/RAppArmor#readme

Dirk

Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com

Simon Urbanek

Wed, Dec 19, 2012 7:46 AM #

On Dec 19, 2012, at 7:38 AM, Joris Meys wrote:

Creating a *specific* user is not enough, as instances can affect each other: any job running on such a system is in jeopardy, because you never know whether your results were modified by a malicious process. Rserve allows a separate uid/gid per connection, so that's one way to tackle it, and it also makes the separation easier. On Linux there is AppArmor, as Dirk pointed out, and on OS X there is sandbox, if you want to limit what the user can do.


And, indeed, trying to filter commands is not the right way as it's trivial to circumvent - anyone with access to R has the capability to run arbitrary native code with .C/.Call and you can't disable that without making R unusable.
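How little a name-based filter has to hold on to can be sketched with documented base R lookups alone. This is only an illustration under the assumption that the filter searches for the literal text `system(`; each entry below yields the same function object as `base::system`, and a filter would have to anticipate every such route.

```r
# Ordinary, documented ways to obtain a function from its name as a string.
# The name is assembled at run time, so no literal "system(" appears.
name <- paste0("sys", "tem")

routes <- list(
  get         = get(name),
  match_fun   = match.fun(name),
  do_call_get = do.call("get", list(name)),
  namespace   = getExportedValue("base", name),
  parse_eval  = eval(parse(text = name))
)

# Compare each result with the real function object.
all_same <- vapply(routes, identical, logical(1), y = base::system)
print(all_same)
```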

Cheers,
Simon

Gabriel Becker

Wed, Dec 19, 2012 8:21 AM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-devel/attachments/20121219/a7ccf9ff/attachment.pl>

Simon Urbanek

Wed, Dec 19, 2012 9:12 AM #

On Dec 19, 2012, at 11:21 AM, Gabriel Becker wrote:

It is a good example of false security. For the reasons mentioned before this doesn't work and can be circumvented:

_developer:*:204:
_locationd:*:205:
_carddav:*:206:
_detachedsig:*:207:
_trustevaluationagent:*:208:
_odchpass:*:209:
_timezone:*:210:
_lda:*:211:
_cvms:*:212:
_usbmuxd:*:213:
[1] 0

The problem is that you can try to plug holes (and sandboxR is trying hard to plug a lot of them), but there will always be new ones. It's simply the wrong approach IMHO.

Cheers,
Simon

Gabriel Becker

Wed, Dec 19, 2012 10:09 AM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-devel/attachments/20121219/4122aeac/attachment.pl>

Simon Urbanek

Wed, Dec 19, 2012 10:10 AM #

On Dec 19, 2012, at 1:09 PM, Gabriel Becker wrote:

No, it's pure R code, I just didn't want to put the exploit on the list ...

Cheers,
S

Simon Urbanek

Wed, Dec 19, 2012 10:47 AM #

On Dec 19, 2012, at 1:10 PM, Simon Urbanek wrote:

I just found another exploit that is harder to catch than the first one. Both work by giving you unrestricted access directly to any function in any namespace or environment. The first one consists of 15 chars + the function name, the second takes 33 characters + function name (both work on base R, no extra packages needed). They use two entirely separate aspects of R to do it. I don't want to make this an exploit contest, I'm just trying to make the point that you cannot secure R by any kind of filtering or pre-processing, because the language is too flexible for that to make you secure. I would strongly discourage anyone from providing an open service that relies on filtering. There is a fairly high probability that there will be a way around it. That's why all realistic approaches work on the back-end.
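As a sketch of what working on the back-end can look like from the R side, the uploaded script can be handed to a separate R process instead of being inspected. The helper name `run_in_child` is hypothetical, and the `timeout` argument of `system2()` assumes R 3.5.0 or later. The process boundary is not a sandbox by itself: the actual confinement has to come from the operating system (a dedicated uid/gid, AppArmor, a virtual machine), as discussed earlier in the thread.

```r
# Hypothetical helper: run an uploaded script in a fresh R process with a
# wall-clock limit. This only shows the plumbing; restricting what the child
# process may touch is the job of the OS, not of this code.
run_in_child <- function(script_path, timeout_sec = 10) {
  rscript <- file.path(R.home("bin"), "Rscript")
  system2(rscript, c("--vanilla", shQuote(script_path)),
          stdout = TRUE, stderr = TRUE, timeout = timeout_sec)
}

# Stand-in for a user upload.
script <- tempfile(fileext = ".R")
writeLines("cat(1 + 1)", script)

out <- run_in_child(script)
print(out)
```

The server process that launches the child never evaluates the user's code itself, so the restrictions can be attached to the child process alone.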

Cheers,
Simon
