[Rcpp-devel] [Rd] must .Call C functions return SEXP?
On Thu, Oct 28, 2010 at 6:04 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
On Thu, Oct 28, 2010 at 1:44 PM, Dominick Samperi <djsamperi at gmail.com> wrote:
See comments on Rcpp below.
On Thu, Oct 28, 2010 at 11:28 AM, William Dunlap <wdunlap at tibco.com> wrote:
-----Original Message-----
From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf Of Andrew Piskorski
Sent: Thursday, October 28, 2010 6:48 AM
To: Simon Urbanek
Cc: r-devel at r-project.org
Subject: Re: [Rd] must .Call C functions return SEXP?
On Thu, Oct 28, 2010 at 12:15:56AM -0400, Simon Urbanek wrote:
Reason I ask, is I've written some R code which allocates two long lists, and then calls a C function with .Call. My C code writes to those two pre-allocated lists,
That's bad! All arguments are essentially read-only so you should never write into them!
I don't see how. (So, what am I missing?) The R docs themselves state that the main point of using .Call rather than .C is that .Call does not do any extra copying and gives one direct access to the R objects. (This is indeed very useful, e.g. to reorder a large matrix in seconds rather than hours.) I could allocate the two lists in my C code, but so far it was more convenient to do so in R. What possible difference in behavior can there be between the two approaches?
Here is an example of how you break the rule that R-language functions
do not change their arguments if you use .Call in the way that you
describe. The C code is in alter_argument.c:
#include <R.h>
#include <Rinternals.h>

SEXP alter_argument(SEXP arg)
{
    SEXP dim;
    /* Build the dim vector c(1, length(arg)). */
    PROTECT(dim = allocVector(INTSXP, 2));
    INTEGER(dim)[0] = 1;
    INTEGER(dim)[1] = LENGTH(arg);
    /* This writes to the caller's object in place, not to a copy. */
    setAttrib(arg, R_DimSymbol, dim);
    UNPROTECT(1);
    return dim;
}
Make a shared library out of this. E.g., on Linux do
R CMD SHLIB -o Ralter_argument.so alter_argument.c
and load it into R with
dyn.load("./Ralter_argument.so")
(Or, on any platform, put it into a package along with
the following R code and build it.)
The associated R code is
myDim <- function(v) .Call("alter_argument", v)
f <- function(z) myDim(z)[2]
Now try using it:
> myData <- 6:10
> myData
[1]  6  7  8  9 10
> f(myData)
[1] 5
> myData
     [,1] [,2] [,3] [,4] [,5]
[1,]    6    7    8    9   10
The argument to f was changed! This should never happen in R.
If you are very careful you might be able to ensure that
no part of the argument to be altered can come from
outside the function calling .Call(). It can be tricky
to ensure that, especially when the argument is more complicated
than an atomic vector.
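The aliasing described above can be modelled without R at all. What follows is only a toy sketch in plain C: ToyVec and every toy_* function are hypothetical names made up for illustration, and none of this is R's actual SEXP machinery. It shows one function that sets a "dim" on the caller's object in place (as alter_argument does) and one that works on a private copy first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an R vector with an optional "dim" attribute.
   Hypothetical type for illustration only, not R's SEXP. */
typedef struct {
    int *data;
    int length;
    int nrow;   /* 0 means "no dim attribute set" */
    int ncol;
} ToyVec;

/* malloc that aborts instead of returning NULL. */
static void *toy_xmalloc(size_t n)
{
    void *p = malloc(n);
    if (p == NULL)
        abort();
    return p;
}

/* Allocate the vector first, first+1, ..., with no dim attribute. */
ToyVec *toy_seq(int first, int length)
{
    assert(length > 0);
    ToyVec *v = toy_xmalloc(sizeof *v);
    v->data = toy_xmalloc((size_t)length * sizeof *v->data);
    for (int i = 0; i < length; i++)
        v->data[i] = first + i;
    v->length = length;
    v->nrow = 0;
    v->ncol = 0;
    return v;
}

/* Deep copy: the result shares no memory with src. */
ToyVec *toy_clone(const ToyVec *src)
{
    assert(src != NULL);
    ToyVec *v = toy_xmalloc(sizeof *v);
    *v = *src;
    v->data = toy_xmalloc((size_t)src->length * sizeof *v->data);
    memcpy(v->data, src->data, (size_t)src->length * sizeof *v->data);
    return v;
}

void toy_free(ToyVec *v)
{
    if (v != NULL) {
        free(v->data);
        free(v);
    }
}

/* Unsafe pattern: sets dim on the caller's object and returns ncol. */
int toy_dim_in_place(ToyVec *arg)
{
    assert(arg != NULL);
    arg->nrow = 1;
    arg->ncol = arg->length;
    return arg->ncol;
}

/* Safe pattern: sets dim on a private copy; the caller's object is untouched. */
int toy_dim_on_copy(const ToyVec *arg)
{
    ToyVec *copy = toy_clone(arg);
    int ncol = toy_dim_in_place(copy);
    toy_free(copy);
    return ncol;
}
```

Both functions return the same answer for the same input; only toy_dim_in_place leaves the caller's vector with a dim it did not have before. In real .Call code the analogous copy step would presumably be duplicate() from Rinternals.h applied to the argument before any write, at the cost of the copy that .Call otherwise avoids.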
"If you live outside the law you must be honest" - Bob Dylan.
This thread seems to suggest (following Bob Dylan) that one needs to be very careful when using C/C++ to modify R's memory directly, because you may modify other R variables that point to the same memory (due to R's copy-by-value semantics and optimizations). What are the implications for the Rcpp package where R objects are exposed to the C++ side in precisely this way, permitting unrestricted modifications? (In the original or "classic" version of this package direct writes to R's memory were done only for performance reasons.) Seems like extra precautions need to be taken to avoid the aliasing problem.
The current Rcpp facilities have the same benefits and dangers as the C macros used in .Call. You get access to the memory of the R object passed as an argument, saving a copy step. You shouldn't modify that memory. If you do, bad things can happen and they will be your fault. If you want to get a read-write copy you clone the argument (in Rcpp terminology).
To Bill: I seem to remember the Dylan quote as "To live outside the law you must be honest."
And There are No Truths Outside the Gates of Eden. Cool, a Dylan thread...