hurdle model with glmmadmb

[cc'ing back to r-sig-mixed-models]

   There are a few different things going on here.

 (1) your attempt to drop data with zero seedlings failed, for the
following reason:
  (a) you defined new variables, seedling2, Tanual2, name2, etc. ...
*outside* of the datos2 data frame;
  (b) you passed data=subset(datos2,seedling2>0) to glmmADMB
  (c) but ... you used the new variables (seedling2 etc.) in your
formula, *not* names of variables from the data frame.  For example,
glmmADMB looks for a variable "seedling2" in the data frame specified by
the data= argument (which has been subsetted to remove the zero-seedling
cases); it doesn't find it, so it pulls the variable from the global
workspace.  But this variable (and the other variables) has *not* been
subsetted.
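
  To make the lookup behaviour concrete, here is a minimal sketch in base R
(no glmmADMB needed). The data frame 'd', its values, and the names
'temp', 'temp2' and 'd_pos' are made up for illustration; model.frame()
stands in for what a modeling function does when it collects variables.

```r
## Toy data standing in for datos2 (values are invented for illustration)
d <- data.frame(seedlings = c(0, 0, 3, 5, 0, 2),
                temp      = c(8, 9, 10, 11, 12, 13))

## Copies defined *outside* the data frame, as in the original attempt
seedling2 <- d$seedlings
temp2     <- d$temp

d_pos <- subset(d, seedlings > 0)   # only the non-zero rows remain

## Formula uses the outside copies: they are not columns of d_pos, so R
## falls back to the formula's environment and picks up the full vectors
mf_outside <- model.frame(seedling2 ~ temp2, data = d_pos)

## Formula uses column names: the subset is respected
mf_inside <- model.frame(seedlings ~ temp, data = d_pos)

n_outside <- nrow(mf_outside)   # every row, zeros included
n_inside  <- nrow(mf_inside)    # only rows with seedlings > 0
```

  Comparing n_outside with n_inside shows that the subset was silently
ignored in the first call.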

  I don't really know how to prevent this kind of error.  I could try to
make glmmADMB *only* look in the data frame specified by data= (at which
point you would get an error saying it couldn't find the 'seedling2'
variable or one of the other variables you specified), but that would be
a little bit tricky to program reliably, and is different (for better or
worse) from the way that the other modeling functions in R work (i.e.
they look first in 'data', then in other environments).  Checking for
length mismatches would work if you only specified *one* variable from
outside of the data frame, but not in the current case.  At least the
warning about zero cases alerts you that something is wrong ...
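
  In the meantime, a check along these lines can be run by hand before
fitting. This is only a sketch: vars_not_in_data() is a hypothetical
helper, not part of glmmADMB, and the toy data are invented. It relies on
all.vars(), so it would also flag legitimate global constants used in a
formula.

```r
## Hypothetical helper (not part of glmmADMB): list the formula variables
## that are absent from the data frame
vars_not_in_data <- function(formula, data) {
  setdiff(all.vars(formula), names(data))
}

d <- data.frame(seedlings = c(0, 0, 3, 5, 0, 2),
                temp      = c(8, 9, 10, 11, 12, 13))
temp2 <- d$temp   # defined outside the data frame

missing_ok  <- vars_not_in_data(seedlings ~ temp,  d)   # nothing missing
missing_bad <- vars_not_in_data(seedlings ~ temp2, d)   # flags temp2

## With only *one* outside variable R itself complains, because the
## subsetted column and the full-length outside vector differ in length
msg <- tryCatch({
  model.frame(seedlings ~ temp2, data = subset(d, seedlings > 0))
  "no error"
}, error = function(e) conditionMessage(e))
```

  When every variable in the formula comes from outside the data frame,
all the lengths agree and no such error is raised, which is what happened
here.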

  Really the best advice is to try to manipulate variables *inside* the
data set, and keep things as clean as possible (see below).

  (2) if you had run glmmADMB with verbose=TRUE you would have seen the error:
42074072>=40000000
 No memory for dvar_vectors
 Need to increase ARRAY_MEMBLOCK_SIZE parameter

 This tells you the proximate reason why glmmADMB failed (although the
ultimate reason is as stated above).  There are 1717 total cases and
only 438 with seedlings>0, so the data set actually being fitted was
nearly four times bigger than the one you intended.   If you did want
to fit such a big model you would have to use extra.args="-ams
500000000" (I figured this out by poking around in the ADMB manual).
However, I then had further trouble making the model work -- I stopped
troubleshooting, knowing that I was working on the wrong data set anyway.

  (3) a couple of minor points: first, you may have trouble using 'name'
as a random effect, since it only has three levels.  Second, as long as
you're going to use the nesting syntax (name/transect/plot), you don't
need to construct the interaction terms yourself.
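
  As a rough illustration of what the slash does, base R's terms() expands
it in an ordinary formula. The expansion inside a random-effects term is
done by the mixed-model package itself, so treat this as an analogy rather
than a description of glmmADMB's internals. The toy grouping data 'g' are
invented.

```r
## The slash operator expands to a main term plus nested interactions
nested_terms <- attr(terms(~ name/transect/plot), "term.labels")

## Toy grouping data (invented) to count the levels of each grouping factor
g <- data.frame(name     = factor(c("A", "A", "B", "B", "C", "C")),
                transect = factor(c(1, 2, 1, 2, 1, 2)))

n_name     <- nlevels(g$name)   # levels of the top-level factor
## transect labels repeat across names, so nested levels are the
## observed name-by-transect combinations
n_transect <- nlevels(interaction(g$name, g$transect, drop = TRUE))
```

  Printing nested_terms shows name, name:transect and name:transect:plot,
i.e. the interaction terms you would otherwise have built by hand.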

  Here is my recommended approach -- I manipulate the variables *only*
inside the data frame, and I do as little manipulation as I can get away
with (to keep things cleaner and easier to read).

## start from a CLEAN R session or rm(list=ls())
datos<-read.csv("regenerado_pisy.csv",header=TRUE,sep=";",dec=".")
datos2 <- transform(na.omit(datos),
                    name=factor(name),
                    transect=factor(transect),
                    plot=factor(plot))

library(glmmADMB)
seed_hurdle1<-glmmadmb(seedlings~I(Tmed_anual^2)+(1|name/transect/plot),
                       data=subset(datos2,seedlings>0),
                       family="truncnbinom1")

On 12-02-09 10:08 AM, Raquel Benavides wrote: