We (the lme4 authors) are having a problem with doing a proper deep
copy of a reference class object in recent versions of R-devel with
the LAZY_DUPLICATE_OK flag in src/main/bind.c enabled.
Apologies in advance for any improper terminology.
TL;DR: Is there an elegant way to force non-lazy/deep copying in our
case? Is anyone else using reference classes with a field that is an
external pointer?
This is how copying of reference classes works in a normal
situation:

    library(data.table)   ## for address() function
    setRefClass("defaultRC",fields="theta")
    d1 <- new("defaultRC")
    d1$theta <- 1
    address(d1$theta)     ## "0xbbbbb70"
    d2 <- d1$copy()
    address(d2$theta)     ## same as above
    d2$theta <- 2
    address(d2$theta)     ## now modified, by magic
    d1$theta              ## unmodified

The extra complication in our case is that many of the objects within
our reference class are actually accessed via an external pointer,
which is initialized when necessary -- details are copied below for
those who want them, or you can see the code at
https://github.com/lme4/lme4
The problem is that this sneaky way of copying the object's contents
doesn't trigger R's (new) rules for recognizing that a non-lazy copy
should be made.

    library(lme4)
    fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
    pp <- fm1@pp
    pp$theta                 ## [1] 0.96673279 0.01516906 0.23090960
    address(pp$theta)        ## something
    pp$Ptr                   ## <pointer: ...>
    xpp <- pp$copy()         ## default is deep copy
    xpp$Ptr                  ## <pointer: (nil)>
    address(xpp$theta)       ## same as above
    xpp$setTheta(c(0,0,0))   ## referenced through Ptr field
    xpp$Ptr                  ## now set to non-nil
    fm1@pp$theta             ## changes to (0,0,0). oops.

So apparently when the xpp$theta object is copied into the external
pointer, a reference/lazy copy is made. (xpp$theta itself is
read-only, so I can't do the assignment that way)
I can hack around this in a very ugly way by doing a trivial
modification when assigning inside the copy method:

    assign("theta", get("theta", envir=selfEnv)+0, envir=vEnv)

... but (a) this is very ugly and (b) it seems very unsafe --
as R gets smarter it should start to recognize trivial changes
like x+0 and x*1 and *not* copy in these cases ...
Method details:

    ## from R/AllClass.R, merPredD RC definition
    ptr = function() {
        'returns the external pointer, regenerating if necessary'
        if (length(theta)) {
            if (.Call(isNullExtPtr, Ptr)) initializePtr()
        }
        Ptr
    },
    ## ditto
    initializePtr = function() {
        Ptr <<- .Call(merPredDCreate, as(X, "matrix"), Lambdat,
                      LamtUt, Lind, RZX, Ut, Utr, V, VtV, Vtr,
                      Xwts, Zt, beta0, delb, delu, theta, u0)
        ...
    }

merPredDCreate in turn just copies the relevant bits into a new C++
class object:

    /* see src/external.cpp */
    SEXP merPredDCreate(SEXP Xs, SEXP Lambdat, SEXP LamtUt, SEXP Lind,
                        SEXP RZX, SEXP Ut, SEXP Utr, SEXP V, SEXP VtV,
                        SEXP Vtr, SEXP Xwts, SEXP Zt, SEXP beta0,
                        SEXP delb, SEXP delu, SEXP theta, SEXP u0) {
        BEGIN_RCPP;
        merPredD *ans = new merPredD(Xs, Lambdat, LamtUt, Lind, RZX,
                                     Ut, Utr, V, VtV,
                                     Vtr, Xwts, Zt, beta0, delb, delu,
                                     theta, u0);
        return wrap(XPtr<merPredD>(ans, true));
        END_RCPP;
    }

reference classes, LAZY_DUPLICATE_OK, and external pointers
3 messages · Simon Urbanek, Ben Bolker
Ben,
On Mar 2, 2014, at 7:38 PM, Ben Bolker <bbolker at gmail.com> wrote:
The problem is that this sneaky way of copying the object's contents
doesn't trigger R's (new) rules for recognizing that a non-lazy copy
should be made.

This is not R's decision - AFAICS your code is incorrectly assuming
that there is no other reference where there is no such guarantee.
Your code that assigns into the external pointer has to make that
decision - it's not R's to make, since you are taking full
responsibility for external pointers by circumventing R's handling.
External pointers have always had reference semantics.

Note that this is not new - you had to inspect the NAMED bits and call
duplicate() yourself to guarantee a copy even in previous R versions.
It just so happened that bugs from not doing so were often masked by R
being more conservative, such that in some circumstances there were
enough references to function arguments that R would defensively
create a new copy.

So, the same applies as it did before - if you store something that
you want to be mutable in C/C++, you have to check the references and
call duplicate() if you don't own the only reference.

Cheers,
Simon
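
The rule in this reply (check whether you hold the only reference, and duplicate if you do not) can be illustrated outside R. What follows is a minimal standalone C++ sketch under stated assumptions: it is not lme4 code and does not use the R API. `Vec`, `own_or_duplicate`, `Pred`, and `caller_copy_is_untouched` are hypothetical names; `std::shared_ptr::use_count()` stands in for R's NAMED/reference bookkeeping, and copying the vector stands in for duplicate().

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an R numeric vector: reference-counted storage.
using Vec = std::shared_ptr<std::vector<double>>;

// Analogue of "check the references and call duplicate()": if anyone else
// still holds the vector, hand back a private copy; otherwise reuse it.
inline Vec own_or_duplicate(Vec v) {
    if (v.use_count() > 1) {
        return std::make_shared<std::vector<double>>(*v);
    }
    return v;
}

// Hypothetical stand-in for the C++ object behind the external pointer.
class Pred {
public:
    explicit Pred(Vec theta) : theta_(own_or_duplicate(std::move(theta))) {}

    // Mutates the stored vector in place, as code behind a pointer would.
    void setTheta(const std::vector<double>& t) { *theta_ = t; }

    const std::vector<double>& theta() const { return *theta_; }

private:
    Vec theta_;
};

// Small demonstration: the caller's vector survives mutation of the Pred.
inline bool caller_copy_is_untouched() {
    Vec shared = std::make_shared<std::vector<double>>(
        std::vector<double>{0.97, 0.02, 0.23});
    Pred p(shared);               // caller still holds `shared`, so Pred duplicates
    p.setTheta({0.0, 0.0, 0.0});  // in-place update of Pred's private copy
    return (*shared)[0] == 0.97 && p.theta()[0] == 0.0;
}
```

In this sketch, passing the vector with `std::move` hands over sole ownership and no copy is made, while passing a vector the caller still holds triggers the copy. The second case corresponds to the `xpp$setTheta()` situation in the question, where the original object still refers to the same `theta`.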
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
On 14-03-02 08:05 PM, Simon Urbanek wrote:
[...] So, the same applies as it did before - if you store something
that you want to be mutable in C/C++ you have to check the references
and call duplicate() if you don't own the only reference.

Thanks, that's extremely useful.
Ben