First at all I would like to Thank Ben Bolker for his suggestion on how
to install package glmmADMB in Linux, definitely it worked. Now I would
ask for some
issued regarding the mixed model specification on my data.
As a reminder, say that I'm working with 2003-2010 fishing data
(clustered data, Primary sampling unit [PSU]=trip, secondary sampling
unit [SSU]=haul);
Specifically, I am trying to model the numbers of fishes of a given
species thrown back to sea (discarded individuals as the Ultimate
sampling Unit [USU]) with a set of available
covariates. The main aim of the model is to assess if yearly increase
in discards found in previous EDA, is also found significant using a
formal and well fitted model. I fitted an starting model using (1|
trip/ haul) as random component (fishing hauls nested on trips,
following the hierarchicalsamplingdesign):
mod.admb.1<-glmmadmb(ndiscarded~year + quarter+deep+duration+
(1|trip/haul), zeroInflation=TRUE,family="nbinom",data=dmod.1)
(I used zero inflation link because previous study indicates that a
model with zeroInflation=FALSE does not account for the excess of zeros
in the data).
"Trip" coding is
dat<-ddply(dat,.(year),function(x){
levels(x$trip) <-1:length(levels(x$trip))
return(data.frame(x))})
Meaning that trip levels restarts in every sampled year. Similar
approach is used for hauls within trips (hauls coding restarts for every
sampled trip).
But I am not sure of this codification, because using this coding means
that I have same trips ID in every year, but obviously they are not the
same trips from one year to another (Perhaps the only similarity is that
they are sampled in same temporal interval each year). As expected,
random coefficients [ranef ()] from mod.admb.1does not take into
consideration that trip number "n" differs between years. I have some
ideas to solve this problem:
a) To recode trip level with not regard of the sampled year (using
factor(1: N) as levels, being N=total number of trips sampled across the
years)
b) Fit models for each year separately and compare intercepts
c) use (1|year/trip/haul) as random component,
Can someone indicate a right practice to deal with this model
specification problem ?
Juan Santos
Research Technician
Instituto Espa?ol de Oceanograf?a
C.O.Vigo
juan. santos at vi.ieo.es
glmmADMB definition of levels for random component
2 messages · Juan Santos, Ben Bolker
14 days later
Juan Santos <juan.santos at ...> writes:
As a reminder, say that I'm working with 2003-2010 fishing data (clustered data, Primary sampling unit [PSU]=trip, secondary sampling unit [SSU]=haul); Specifically, I am trying to model the numbers of fishes of a given species thrown back to sea (discarded individuals as the Ultimate sampling Unit [USU]) with a set of available covariates. The main aim of the model is to assess if yearly increase in discards found in previous EDA, is also found significant using a formal and well fitted model. I fitted an starting model using (1| trip/ haul) as random component (fishing hauls nested on trips, following the hierarchicalsamplingdesign):
mod.admb.1<-glmmadmb(ndiscarded~year + quarter+deep+duration+
(1|trip/haul), zeroInflation=TRUE,family="nbinom",data=dmod.1)
(I used zero inflation link because previous study indicates that a
model with zeroInflation=FALSE does not account for the excess of zeros
in the data).
"Trip" coding is
dat<-ddply(dat,.(year),function(x){
levels(x$trip) <-1:length(levels(x$trip))
return(data.frame(x))})
Meaning that trip levels restarts in every sampled year. Similar
approach is used for hauls within trips (hauls coding restarts for every
sampled trip).
But I am not sure of this codification, because using this coding means
that I have same trips ID in every year, but obviously they are not the
same trips from one year to another (Perhaps the only similarity is that
they are sampled in same temporal interval each year). As expected,
random coefficients [ranef ()] from mod.admb.1does not take into
consideration that trip number "n" differs between years. I have some
ideas to solve this problem:
a) To recode trip level with not regard of the sampled year (using
factor(1: N) as levels, being N=total number of trips sampled across the
years)
b) Fit models for each year separately and compare intercepts
c) use (1|year/trip/haul) as random component,
Can someone indicate a right practice to deal with this model
specification problem ?
Did anyone ever answer you? It seems as though this may have slipped through the cracks. As it turns out you're right to be concerned in this case, the model specification you have (~year + (1|trip/haul), leaving out all the other stuff) does assume that trips are consistent across years. Your option (a) [recode to an "implicitly nested" rather than "explicitly nested" coding] would work. I don't like option (b) because it would require fitting a separate parameter for all of your other predictor variables in each year [i.e. it would be equivalent to introducing a (quarter+deep+duration):year interaction], which you may not want to do. Option (c) doesn't work so well either, because it introduces year as a random effect, which (unless you have coded year as numeric in order to look for a log-linear trend in by-catch with year) will be completely redundant with the fixed effect of year in your model. If you don't want to use option (a), then I would recommend the following option (d): ndiscarded~year+ [other fixed effects] + (1|year:trip)+(1|year:trip:haul) which expands the trip/haul nesting (which would otherwise expand to trip+trip:haul) to include an interaction with year. If this gives you trouble you could also do this even more explicitly by adding a 'yeartrip' variable to your data set which was defined as interaction(year,trip) and similarly for 'yeartriphaul' ...