glmmADMB definition of levels for random component - R-SIG-mixed-models

Tue, Dec 13, 2011 1:45 AM #

First at all I would like to Thank Ben Bolker for his suggestion on how 
to install package glmmADMB in Linux, definitely it worked. Now I would 
ask for some
issued regarding the mixed model specification on my data.

As a reminder, say that I'm working with 2003-2010  fishing data 
(clustered data, Primary sampling unit [PSU]=trip, secondary sampling 
unit [SSU]=haul);
Specifically,  I am trying to model the numbers of fishes of a given 
species thrown back to sea (discarded individuals as the Ultimate 
sampling Unit [USU]) with a set of available
covariates. The main aim of  the model is to assess if  yearly increase 
in discards found in previous EDA,  is also found significant using a 
formal and well fitted model.  I fitted an starting model using (1| 
trip/ haul) as random component  (fishing hauls nested on trips, 
following the hierarchicalsamplingdesign):

mod.admb.1<-glmmadmb(ndiscarded~year + quarter+deep+duration+ 
(1|trip/haul), zeroInflation=TRUE,family="nbinom",data=dmod.1)

(I used zero inflation link  because previous study indicates that a 
model with zeroInflation=FALSE  does not account for the excess of zeros 
in the data).

"Trip" coding is

dat<-ddply(dat,.(year),function(x){
         levels(x$trip) <-1:length(levels(x$trip))
         return(data.frame(x))})

Meaning that trip levels restarts in every sampled year. Similar 
approach is used for hauls within trips (hauls coding restarts for every 
sampled trip).

But I am not sure of this codification, because using this coding means 
that I have same trips ID in every year, but obviously  they are not the 
same trips from one year to another (Perhaps the only similarity is that 
they are sampled in  same temporal interval each year).  As expected, 
random coefficients [ranef ()] from mod.admb.1does not  take into 
consideration that trip number "n" differs between years. I have some 
ideas to solve this problem:

a) To recode trip level with not regard of the sampled  year  (using 
factor(1: N) as levels, being N=total number of trips sampled across the 
years)
b) Fit models for each year separately and compare intercepts
c) use (1|year/trip/haul) as random component,

Can someone indicate a right practice to deal with this model 
specification problem ?


Juan Santos
Research Technician
Instituto Espa?ol de Oceanograf?a
C.O.Vigo

juan. santos at vi.ieo.es

Ben Bolker

Tue, Dec 27, 2011 10:53 AM #

Juan Santos <juan.santos at ...> writes:

As a reminder, say that I'm working with 2003-2010 fishing data
(clustered data, Primary sampling unit [PSU]=trip, secondary
sampling unit [SSU]=haul); Specifically, I am trying to model the
numbers of fishes of a given species thrown back to sea (discarded
individuals as the Ultimate sampling Unit [USU]) with a set of
available covariates. The main aim of the model is to assess if
yearly increase in discards found in previous EDA, is also found
significant using a formal and well fitted model.  I fitted an
starting model using (1| trip/ haul) as random component (fishing
hauls nested on trips, following the hierarchicalsamplingdesign):

mod.admb.1<-glmmadmb(ndiscarded~year + quarter+deep+duration+ 
(1|trip/haul), zeroInflation=TRUE,family="nbinom",data=dmod.1)

(I used zero inflation link  because previous study indicates that a 
model with zeroInflation=FALSE  does not account for the excess of zeros 
in the data).

"Trip" coding is

dat<-ddply(dat,.(year),function(x){
         levels(x$trip) <-1:length(levels(x$trip))
         return(data.frame(x))})

Meaning that trip levels restarts in every sampled year. Similar 
approach is used for hauls within trips (hauls coding restarts for every 
sampled trip).

But I am not sure of this codification, because using this coding means 
that I have same trips ID in every year, but obviously  they are not the 
same trips from one year to another (Perhaps the only similarity is that 
they are sampled in  same temporal interval each year).  As expected, 
random coefficients [ranef ()] from mod.admb.1does not  take into 
consideration that trip number "n" differs between years. I have some 
ideas to solve this problem:

a) To recode trip level with not regard of the sampled  year  (using 
factor(1: N) as levels, being N=total number of trips sampled across the 
years)
b) Fit models for each year separately and compare intercepts
c) use (1|year/trip/haul) as random component,

Can someone indicate a right practice to deal with this model 
specification problem ?

Did anyone ever answer you?  It seems as though this may have
slipped through the cracks.

  As it turns out you're right to be concerned in this case,
the model specification you have (~year + (1|trip/haul), leaving
out all the other stuff) does assume that trips are consistent
across years.  Your option (a) [recode to an "implicitly nested"
rather than "explicitly nested" coding] would work.  I don't
like option (b) because it would require fitting a separate parameter
for all of your other predictor variables in each year [i.e.
it would be equivalent to introducing a (quarter+deep+duration):year
interaction], which you may not want to do.  Option (c) doesn't
work so well either, because it introduces year as a random effect,
which (unless you have coded year as numeric in order to look for
a log-linear trend in by-catch with year) will be completely
redundant with the fixed effect of year in your model.  If you
don't want to use option (a), then I would recommend the following
option (d):

  ndiscarded~year+ [other fixed effects] + (1|year:trip)+(1|year:trip:haul)

which expands the trip/haul nesting (which would otherwise expand
to trip+trip:haul) to include an interaction with year.  If this gives
you trouble you could also do this even more explicitly by adding
a 'yeartrip' variable to your data set which was defined as 
interaction(year,trip) and similarly for 'yeartriphaul' ...