iterative list rm [Was: R-devel Digest, Vol 83, Issue 2]
On 01/03/2010 02:04 AM, Simon Urbanek wrote:
On Jan 2, 2010, at 5:41 PM, Romain Francois wrote:
On 01/02/2010 11:12 PM, Duncan Murdoch wrote:
On 02/01/2010 3:16 PM, Laurent Gautier wrote:
On 1/2/10 8:53 PM, Duncan Murdoch wrote:
Simon Urbanek wrote:
On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:
On 1/2/10 5:56 PM, Duncan Murdoch wrote:
On 02/01/2010 11:36 AM, Laurent Gautier wrote:
[Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-request at r-project.org wrote:
(...)
I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT.
PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack
UNPROTECT_PTR is also possible, which does a linear search through the stack and unprotects something possibly deep within it. There is also REPROTECT which allows you to replace an entry within the stack. I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease because it doesn't use so much stack space when it needs to go deep into the stack to release, but it is possible the compiler recognizes the tail recursion and RecursiveRelease is implemented efficiently. In that case it could be more efficient than UNPROTECT_PTR, which has to move all the other entries down to fill the newly vacated space. Really the only reliable way to answer efficiency questions like this is to try both ways and see which works better in your application.
Thanks. I did not know about UNPROTECT_PTR. I had concerns over the stack usage, but so far it did not prove too much of a problem. Still, why isn't the tail recursion explicitly made an iteration then ? This would take the "may be the compiler figures it out, may be not" variable out of the equation.
Careful - the protection stack (bookkeeping by R) has nothing to do with the C function call stack hence it has nothing to do with the compiler. The protection stack is global so usually you don't run out of it unless something goes horribly wrong (=infinite loop).
I think Laurent was referring to RecursiveRelease, which could use a lot of C stack if you choose to release something that is deep in the list, since it checks the head, and if that doesn't match, calls itself again on the rest of the list. (I checked, and at least one version of gcc doesn't recognize the tail recursion: it really does generate a recursive call.) Laurent asked why it isn't optimized to avoid the recursion: I think the answer is simply because it is so rarely used that nobody has bothered.
Yes, I was referring to RecursiveRelease. Sorry if this was not clear. What are the chances for a patch to be accepted ? At first sight(*), making that tail recursion an iterative function is not a major undertaking, and reviewing the patch be fairly straightforward... but I can always use that time otherwise if the answer to the question is "nil".
I don't think I would want to review such a patch (I don't know the memory manager well, I don't know that there is really a case where it matters enough to be worth doing), so I'd say if you don't get a message from a core member volunteering to do so, you should assume it won't be accepted. But that doesn't mean you shouldn't write the code for your own internal use and edification, and if you can put together a demo that shows it really makes a big difference in a realistic situation, you might get a different response. Duncan Murdoch
From what I understand, this has little to do with the memory manager, and resumes to the simple problem "how to remove an object from a list".
Something like this, using the amazing inline and inspect packages:
require( inline )
require( inspect )
remover<- cfunction(signature( list = "language", object = "environment" ), '
if( !isNull( list ) ){
SEXP x = list ;
SEXP y ;
while( CAR(x) != object&& CADR(x) != R_NilValue ){
y = x ;
x = CDR(x) ;
}
if( CAR(x) == object ) SETCDR(y, CDR(x) ) ;
}
return list ;
That blows up if the object is the first in the list BTW. You probably meant more something like
static SEXP ReleaseObj(SEXP object, SEXP list) {
if (!isNull(list)) {
if (CAR(list) != object) {
SEXP c = list;
while (!isNull(CDR(c))) {
if (CADR(c) == object) {
CDR(c) = CDDR(c);
break;
}
c = CDR(c);
}
} else return CDR(list);
}
return list;
}
(For non-core replace CDR(c) with SETCDR(c) although that goes through the write barrier).
Cheers,
Simon
Thanks. And sorry for not testing this case. (oops) Romain
Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects `- http://tr.im/HlX9 : new package : bibtex