
formal argument "envir" matched by multiple actual arguments

9 messages · Hervé Pagès, Henrik Bengtsson, Luke Tierney +1 more

#
Hi list,

This looks similar to the problem reported here
   https://stat.ethz.ch/pipermail/r-devel/2006-April/037199.html
by Henrik Bengtsson a long time ago. It is very sporadic and
non-reproducible.
Henrik, do you remember if your code was using reg.finalizer()?
I tend to suspect it but I'm not sure.

I've been hunting this bug for months but today, with the help of other
Bioconductor users, I was able to isolate it and to write some code that
seems to "almost" reproduce it (i.e. not systematically but most of the
time).

(Just to put some context to the code below: it's a simplified version
of some more complex code that we use in Bioconductor to manage memory
caching of some big objects stored on disk. The idea is that objects of
class A can be named. All A objects with the same name form a group.
The code below implements a simple mechanism to trigger some action when
a group is completely removed from memory i.e. when the last object in
a group is garbage collected.)


   setClassUnion("environmentORNULL", c("environment", "NULL"))

   setClass("A",
     representation(
       aa="integer",
       groupname="character",
       groupanchor="environmentORNULL"
     )
   )

   .A.group.sizes <- new.env(hash=TRUE, parent=emptyenv())

   .inc.A.group.size <- function(groupname)
   {
     group.size <- 1L
     if (exists(groupname, envir=.A.group.sizes, inherits=FALSE))
         group.size <- group.size +
                       get(groupname, envir=.A.group.sizes, inherits=FALSE)
     assign(groupname, group.size, envir=.A.group.sizes, inherits=FALSE)
   }

   .dec.A.group.size <- function(groupname)
   {
     group.size <- get(groupname, envir=.A.group.sizes, inherits=FALSE) - 1L
     assign(groupname, group.size, envir=.A.group.sizes, inherits=FALSE)
     return(group.size)
   }

   newA <- function(groupname="")
   {
     a <- new("A", groupname=groupname)
     if (!identical(groupname, "")) {
         .inc.A.group.size(groupname)
         groupanchor <- new.env(parent=emptyenv())
         reg.finalizer(groupanchor,
                       function(e)
                       {
                           group.size <- .dec.A.group.size(groupname)
                           if (group.size == 0L) {
                               cat("no more object of group",
                                   groupname, "in memory\n")
                               # take some action
                           }
                       }
         )
     a@groupanchor <- groupanchor
     }
     return(a)
   }


The following commands seem to trigger the problem:

   > for (i in 1:2000) {a1 <- newA("group1")}
   > as.list(.A.group.sizes)
   > gc()
   > as.list(.A.group.sizes)
   > for (i in 1:2000) {a2 <- newA("group2")}
   Error in assign(".Method", method, envir = envir) :
     formal argument "envir" matched by multiple actual arguments

If it doesn't, then adding more rounds should finally do it:

   gc()
   for (i in 1:2000) {a3 <- newA("group3")}
   gc()
   for (i in 1:2000) {a4 <- newA("group4")}

   etc...

Thanks in advance for any help with this!

H.

 > sessionInfo()
R version 2.9.0 (2009-04-17)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_CA.UTF-8;LC_NUMERIC=C;LC_TIME=en_CA.UTF-8;LC_COLLATE=en_CA.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_CA.UTF-8;LC_PAPER=en_CA.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_CA.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
#
In fact reg.finalizer() looks like a dangerous feature.

If the finalizer itself triggers (implicitly or
explicitly) garbage collection, then bad things happen.
In the following example, garbage collection is triggered
explicitly (using R-2.9.0):

    setClass("B", representation(bb="environment"))

    newB <- function()
    {
      ans <- new("B", bb=new.env())
      reg.finalizer(ans@bb,
                    function(e)
                    {
                        gc()
                        cat("cleaning", class(ans), "object...\n")
                    }
      )
      return(ans)
    }

    > for (i in 1:500) {cat(i, "\n"); b1 <- newB()}
    1
    2
    3
    4
    5
    6
    ...
    13
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    14
    ...
    169
    170
    171
    Error: not a weak reference
    Error: not a weak reference
    [repeat the above line thousands of times]
    ...
    Error: not a weak reference
    Error: not a weak reference
    cleaning B object...
    Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
    Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
    [repeat the above line thousands of times]
    ...
    Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
    Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
    172
    ...
    246
    247
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...

     *** caught segfault ***
    address 0x41, cause 'memory not mapped'

    Traceback:
     1: gc()
     2: function (e) {    gc()    cat("cleaning", class(ans), "object...\n")}(<environment>)

    Possible actions:
    1: abort (with core dump, if enabled)
    2: normal R exit
    3: exit R without saving workspace
    4: exit R saving workspace
    Selection: 2
    Save workspace image? [y/n/c]: n
    Segmentation fault

So apparently, if the finalizer triggers garbage collection,
then we can end up with a corrupted session. Then anything can
happen, from the strange 'formal argument "envir" matched by
multiple actual arguments' error I reported in the previous post,
to a segfault. In the worst case, nothing apparently happens but
the output produced by the code is wrong.

Maybe garbage collection requests should be ignored during the
execution of the finalizer? (and more generally during garbage
collection itself)

Cheers,
H.
#
Hi.

2009/6/1 Hervé Pagès <hpages at fhcrc.org>:
Yes.  This was/is observed with objects extending the Object class of
R.oo, and the constructor of Object uses reg.finalizer() [which then
calls finalize() that can be "overloaded"].  The fact that the garbage
collector is involved could explain why this bug(?) is hard to
reproduce.

It's been a while since I saw this problem (and we do instantiate way
more Object:s these days).  Looking at my source code comments and the
post you refer to, I suspect that I managed to circumvent the issue with
the following trick (looking at my code, I have several of those
statements):

envir2 <- envir
get(name, envir=envir2)

Also, on March 6, 2008 I reported to R-devel on a related problem with '%in%':

  http://tolstoy.newcastle.edu.au/R/e4/devel/08/03/0708.html

That one I circumvent by now only using is.element(a,b) instead of a %in% b.

Maybe this gives you further clues.

/Henrik

BTW. You need to be careful when you register a finalizer that uses
code in a package, because the package may have been detached by the
time the finalizer runs.  This may cause an error in the finalizer,
which can give further side effects.  See here:

  http://tolstoy.newcastle.edu.au/R/e2/devel/07/08/4251.html
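
One possible way to guard against the situation described above is to wrap the finalizer so that an error inside it is caught rather than propagated. The following is only a minimal sketch: `safe_finalizer` is a hypothetical helper name (it is not part of R or R.oo), and the failing finalizer merely simulates a call into a package that is no longer available.

```r
# Hypothetical helper (an assumption for illustration, not an R or R.oo API):
# wraps a finalizer so that an error raised inside it is caught and reported.
safe_finalizer <- function(fun) {
  force(fun)
  function(e) {
    tryCatch(
      fun(e),
      error = function(cond) {
        # Report and swallow the error; a finalizer has no caller to handle it.
        message("finalizer failed: ", conditionMessage(cond))
        invisible(NULL)
      }
    )
  }
}

# Usage: the wrapped finalizer fails (simulating code from a detached
# package) without raising an error at the point where it is run.
anchor <- new.env(parent = emptyenv())
reg.finalizer(anchor,
              safe_finalizer(function(e) stop("package code not available")))
rm(anchor)
invisible(gc())
```

This only contains errors raised by the finalizer's own code; it would not have helped with the memory corruption discussed in this thread.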
#
Nice case - I think you're onto something. /Henrik

2009/6/2  <hpages at fhcrc.org>:
#
On Tue, 2 Jun 2009, Henrik Bengtsson wrote:

            
Thanks for the report.  The gc proper does not (or should not) do
anything that could cause allocation or trigger another gc.  The gc
proper only identifies objects ready for finalization; running the
finalizers happens outside the gc proper where allocation and gc calls
should be safe.  This looks like either a missing PROTECT call in the
code for running finalizers or possibly a more subtle bug in managing
the lists of objects in different states of finalization. I will look
more carefully when I get a chance.

luke
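
The behaviour Luke describes (finalizers run after the collection proper, so allocating inside them should be safe) can be illustrated with a small sketch. The names `log_env` and `make_anchor` are assumptions made up for this example; the finalizer allocates a vector and records that it ran.

```r
# Environment used to record how many times the finalizer has run.
log_env <- new.env(parent = emptyenv())
log_env$runs <- 0L

# Hypothetical constructor: returns an environment with a finalizer attached.
make_anchor <- function() {
  anchor <- new.env(parent = emptyenv())
  reg.finalizer(anchor, function(e) {
    # Allocate inside the finalizer; this is expected to be safe because
    # finalizers run outside the gc proper.
    tmp <- numeric(1e4)
    log_env$runs <- log_env$runs + 1L
  })
  anchor
}

a <- make_anchor()
rm(a)            # drop the only reference to the anchor
invisible(gc())  # collection marks the anchor; its finalizer runs afterwards
log_env$runs     # number of finalizer runs recorded so far
```

On a version of R that includes the fix mentioned later in this thread, this pattern should be safe; on R 2.9.0 a finalizer that allocates or calls gc() could corrupt the session as reported above.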

  
    
10 days later
#
On Tue, 2 Jun 2009, luke at stat.uiowa.edu wrote:

            
This is now fixed in R-devel and R-patched (it was essentially a
missing PROTECT call).

luke

  
    
#
Thank you Luke!  I know you made many people happy by fixing this one,
especially over at BioC.

Is this a candidate for the contest of the bug that survived the
longest without being caught?

I reported on its symptoms in April 2006, but I think I first observed
them in 2003-2004 (thinking for a long time that it was a problem with
my code).

/Henrik
On Fri, Jun 12, 2009 at 9:01 AM, <luke at stat.uiowa.edu> wrote:
#
Great news! This bug was driving me mad.

I have some scripts that had _high_ probability of running into the  
bug. I'll recompile and check it out over the weekend.

Kasper
On Jun 12, 2009, at 10:25 , Henrik Bengtsson wrote:

            
#
That's great news Luke! Thanks for finding and fixing that one!
I will do some testing and update the BioC build system with
the latest R.

H.
luke at stat.uiowa.edu wrote: