Hey all, We are building an R connector for our web application. The user can upload a script so it can be executed on the server. Is there a way to scan the script for insidious commands (writing on the disk for example) and purge them out? I guess a simple search is not enough so is there a way to analyse the pseudo code? Best, Etienne
Scanning a R script for potentially insidious commands
12 messages · Etienne Sévin, R. Michael Weylandt, Jan T. Kim +4 more
On Dec 18, 2012, at 12:48 PM, Etienne Sévin <e.sevin at epiconcept.fr> wrote:
Hey all, We are building a R connector for our web application. The user can upload a script so it can be executed on the server. Is there a way to scan the script for insidious commands (writing on the disk for example) and purge them out?
Completely, not that I know of: but grepping for system() and eval() should catch a majority of red flags. Michael
I guess a simple search is not enough so is there a way to analyse the pseudo code? Best, Etienne
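A plain-text grep also matches comments and string contents. A slightly more structured version of the same idea is to parse the script and list the symbol names in its parse tree. The sketch below uses only base R (`parse()` and `all.names()`). The helper names `called_names` and `flag_script` and the blacklist are illustrative assumptions, not something proposed in the thread. It still only sees names that appear literally in the source, so it is a first-pass check rather than a safeguard.

```r
# Hypothetical helper: collect every symbol name used in a script's parse tree.
called_names <- function(code) {
  exprs <- parse(text = code, keep.source = FALSE)
  unique(unlist(lapply(exprs, all.names)))
}

# Hypothetical helper: report which blacklisted names occur literally in the code.
flag_script <- function(code, blacklist = c("system", "eval")) {
  intersect(called_names(code), blacklist)
}

flag_script('x <- 1\nsystem("ls")')            # flags "system"
flag_script('# system("ls")\ny <- mean(1:3)')  # the comment is ignored, nothing flagged
```

Because the check works on parsed symbols, a function name that is assembled at run time never shows up in the result.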
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
On Wed, Dec 19, 2012 at 12:39:21PM +0100, Joris Meys wrote:
The safest way to prevent attacks using an R connector, is managing the permissions for the application on your own server. We do that with the RStudio Server application we have running. You have to take into account that R allows for many interactions with the system. Also file(), dir(), unlink() and all sys. functions have the potential to screen and possibly alter your system. Not only system() and eval() pose a security problem...
just out of curiosity, how do you disable these functions? Is there
a way to "blacklist" functions as such in R, regardless of what name
is used to call them? Simple string pattern matching (as I understand
Michael's "grepping" suggestion below) can be circumvented by using
the get function, as in
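i <- c(19, 25, 19, 20, 5, 13)          # one suitable choice: s, y, s, t, e, m
s <- paste(letters[i], collapse = "")  # builds the string "system" at run time
f <- get(s)                            # looks the function up by its assembled name
f("insidiouscommand")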
where i contains suitable indices to produce "system". So the system
function needs disabling as such, as there are innumerable ways to
code up its invocation.
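The point about `get()` can be checked without running any shell command. In the sketch below, `naive_scan` is a hypothetical pattern-matching filter written for illustration (not an existing tool), and the blacklist is an assumption. The indirect variant only retrieves the function and compares it with `base::system`; it never calls it.

```r
# Hypothetical naive filter: TRUE if a blacklisted name is called literally.
naive_scan <- function(code, blacklist = c("system", "eval", "unlink")) {
  pattern <- paste0("\\b(", paste(blacklist, collapse = "|"), ")\\s*\\(")
  grepl(pattern, code, perl = TRUE)
}

direct   <- 'system("echo hello")'
indirect <- 'f <- get(paste(letters[c(19, 25, 19, 20, 5, 13)], collapse = "")); f("echo hello")'

naive_scan(direct)    # TRUE: the literal name is present
naive_scan(indirect)  # FALSE: the name only exists at run time

# The indirect route nevertheless yields the very same function object.
f <- get(paste(letters[c(19, 25, 19, 20, 5, 13)], collapse = ""))
identical(f, base::system)  # TRUE
```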
How to do this exactly, depends very much on both the server and OS settings and the specific R connector you use/build. But don't count on R alone to provide safety.
Personally, I'd suggest to consider long and hard whether executing user submitted R code is really necessary, and if that's the case, my inclination would be to run that on a virtual machine and sandbox that as much as you can. Best regards, Jan
Cheers, Joris
-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Mathematical Modelling, Statistics and Bio-Informatics tel : +32 9 264 59 87 Joris.Meys at Ugent.be
+- Jan T. Kim -------------------------------------------------------+ | email: jttkim at gmail.com | | WWW: http://www.jtkim.dreamhosters.com/ | *-----=< hierarchical systems are for files, not for humans >=-----*
Jeroen has a package devoted to the sandboxing approach in conjunction with the system-level AppArmor facility: RAppArmor. See http://cran.r-project.org/web/packages/RAppArmor/index.html and more details at https://github.com/jeroenooms/RAppArmor#readme Dirk
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
On Dec 19, 2012, at 7:38 AM, Joris Meys wrote:
On Wed, Dec 19, 2012 at 1:02 PM, Jan T Kim <jttkim at googlemail.com> wrote:
just out of curiosity, how do you disable these functions?
You got me wrong: We don't disable these functions, we just don't give the R instance that's executing the file any permissions on the system. So trying to run any function that wants to access the system only results in error messages. I believe we did that by creating a specific user account and linked that to the R application behind the interface. But sandboxing (as you mentioned) is just as good.
Creating a *specific* user is not enough, as instances can affect each other (i.e. any job running on such a system is in jeopardy: you never know if your results were modified by a malicious process). Rserve allows a separate uid/gid per connection, so that's one way to tackle that; it also makes the separation easier. As Dirk pointed out, on Linux there is AppArmor, and on OS X there is sandbox, if you want to limit what the user can do. And, indeed, trying to filter commands is not the right way, as it's trivial to circumvent: anyone with access to R has the capability to run arbitrary native code with .C/.Call, and you can't disable that without making R unusable. Cheers, Simon
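As a rough illustration of keeping the security boundary outside the script, the sketch below runs an uploaded file in a separate `Rscript` process with a time limit, using base R's `system2()`. The helper name `run_isolated` and the 10-second default are assumptions for illustration. The code restricts nothing by itself: the unprivileged account, per-connection uid/gid, or AppArmor profile discussed above would have to be configured at the OS level for the process that gets launched.

```r
# Hypothetical helper: run a script in a fresh R process and capture its output.
# The isolation itself must come from how the OS treats that process.
run_isolated <- function(script_path, timeout_sec = 10) {
  rscript <- file.path(R.home("bin"), "Rscript")
  out <- suppressWarnings(
    system2(rscript, c("--vanilla", shQuote(script_path)),
            stdout = TRUE, stderr = TRUE, timeout = timeout_sec)
  )
  status <- attr(out, "status")  # absent when the process exits with status 0
  list(output = as.character(out),
       status = if (is.null(status)) 0L else status)
}

# Usage with a harmless temporary script.
tmp <- tempfile(fileext = ".R")
writeLines('cat("ok\\n")', tmp)
res <- run_isolated(tmp)
res$status
```

A separate process also means a crash or runaway loop in the uploaded script does not take the calling R session with it.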
On Dec 19, 2012, at 11:21 AM, Gabriel Becker wrote:
See also: https://github.com/Rapporter/sandboxR sandboxR (not written by me) is a proof of concept for security inside R (as opposed to security outside R as discussed above) via evaluating all R commands in a specialized security environment (R environment that is) which contains safe replacements for blacklisted functions.
It is a good example of false security. For the reasons mentioned before this doesn't work and can be circumvented:
sandbox("XXXX('tail /etc/group')")
_developer:*:204:
_locationd:*:205:
_carddav:*:206:
_detachedsig:*:207:
_trustevaluationagent:*:208:
_odchpass:*:209:
_timezone:*:210:
_lda:*:211:
_cvms:*:212:
_usbmuxd:*:213:
[1] 0
The problem is that you can try to plug holes (and sandboxR is trying hard to plug a lot of them), but there will always be new ones. It's simply the wrong approach IMHO. Cheers, Simon
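The general difficulty with masking functions inside an R environment can be seen in a toy setup. The sketch below is not sandboxR and is not the exploit withheld above; `toy_sandbox`, `sandbox_env` and the `DEMO_SECRET` variable are all made up for illustration, and a harmless function (`Sys.getenv`) stands in for a dangerous one. A real sandbox would presumably guard against this particular route, which is exactly the hole-plugging being described.

```r
Sys.setenv(DEMO_SECRET = "s3cret")  # harmless stand-in for something worth protecting

# Toy sandbox: user code is evaluated in an environment that masks one function.
sandbox_env <- new.env(parent = globalenv())
sandbox_env$Sys.getenv <- function(...) stop("Sys.getenv() is disabled in this sandbox")

toy_sandbox <- function(code) eval(parse(text = code), envir = sandbox_env)

# The plain call finds the masking replacement first and fails.
direct <- tryCatch(toy_sandbox('Sys.getenv("DEMO_SECRET")'),
                   error = function(e) "blocked")

# Qualifying the name with its namespace skips the mask entirely.
qualified <- toy_sandbox('base::Sys.getenv("DEMO_SECRET")')

direct     # "blocked"
qualified  # "s3cret"
```

Masking `::` as well closes this one route, but every further lookup mechanism in the language needs the same treatment.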
HTH, ~G
-- Gabriel Becker Graduate Student Statistics Department University of California, Davis
On Dec 19, 2012, at 1:09 PM, Gabriel Becker wrote:
Simon, I don't really have a horse in this race (as I said, I didn't write sandboxR), but it seems that if you control library() (to prevent loading "untrusted" packages, specifically including things like Rcpp and Rffi) and dyn.load(), the problem of executing arbitrary compiled code can be curtailed. If I'm wrong please let me know, I'm always looking to learn. I assume XXXX in your example was some C code you whipped up and then loaded using one of the methods above? Or a .Call to an existing internal R function?
No, it's pure R code, I just didn't want to put the exploit on the list ... Cheers, S
~G
On Dec 19, 2012, at 1:10 PM, Simon Urbanek wrote:
No, it's pure R code, I just didn't want to put the exploit on the list ...
I just found another exploit that is harder to catch than the first one. Both work by giving you unrestricted access directly to any function in any namespace or environment. The first one consists of 15 chars + the function name, the second takes 33 characters + function name (both work on base R, no extra packages needed). They use two entirely separate aspects of R to do it. I don't want to make this an exploit contest, I'm just trying to make the point that you cannot try to secure R by any kind of filtering or pre-processing, because the language is too flexible to make you feel secure. I would strongly discourage anyone from providing an open service that relies on filtering. There is a fairly high probability that there will be a way around it. That's why all realistic approaches work on the back-end. Cheers, Simon