GLM mixed model with quasibinomial family

Fri, Jul 30, 2010 1:22 PM

[cc'ing back to r-sig-ecology]
  [Please keep sending replies to r-sig-ecology so that others may
benefit from the conversation, and so that others can answer if I
can't or am too busy (!)]

  I don't know what you mean by the "degree of fit to your data". Are
you trying to do a goodness-of-fit test? The standard deviance-based
test for goodness of fit/overdispersion that you may be thinking of
(e.g. see if residual deviance/residual df approx. 1, or test residual
deviance against a chi-squared distribution with df=(residual df))
only applies to NON-overdispersed models.
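
For reference, the deviance-based check described above might look like the sketch below. The data are simulated and every name in it (n, time, successes, failures, fit) is made up for illustration; the model is deliberately fitted with family = binomial, since the test is only meaningful when no overdispersion parameter is estimated.

```r
# Simulated (hypothetical) data: 16 binomial observations, 2 per site
set.seed(1)
n <- rep(50, 16)
time <- factor(rep(c("before", "after"), times = 8),
               levels = c("before", "after"))
p <- ifelse(time == "after", 0.5, 0.3)
successes <- rbinom(16, size = n, prob = p)
failures <- n - successes

# Plain (non-quasi) binomial GLM
fit <- glm(cbind(successes, failures) ~ time, family = binomial)

# Ratio of residual deviance to residual df; values near 1 suggest
# no overdispersion
disp_ratio <- deviance(fit) / df.residual(fit)

# Residual deviance tested against a chi-squared distribution with
# df = residual df
p_gof <- pchisq(deviance(fit), df = df.residual(fit), lower.tail = FALSE)

print(c(ratio = disp_ratio, p.value = p_gof))
```

A ratio well above 1, or a very small p-value, would point to overdispersion or lack of fit; once a quasi- family is used the dispersion is estimated from the data and this check no longer tells you anything.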

  You might want to get the book by Zuur et al. on mixed models in ecology.

  Ben Bolker

On Fri, Jul 30, 2010 at 11:49 AM, Javier Martinez
<javi.martinez.lopez at gmail.com> wrote:

Hello again Mr. Bolker,

I have now tried glmmPQL and it looks very promising, because I do in
fact get the expected results. Since I do not get a deviance value
from these models I cannot assess their degree of fit to my data, so I
was wondering whether it would be possible to assess it somehow by
fitting a linear model between the expected and fitted values from the
resulting glmmPQL model. Does that make sense to you?

Thank you very much for any advice, and regards,

Javier

On Thu, Jul 29, 2010 at 6:25 PM, Ben Bolker <bbolker at gmail.com> wrote:

  A little more information would probably be helpful. Here's what
I'm guessing:

  You have no 'treatment' except the passage of time, and only two
time points (say, before/after). You have a total of 16 measurements
(2 each at 8 sites), they are like binomial data (number of counts of
type x out of a total number N counted) but overdispersed. You want
to test whether the proportion of type x changed between 'before' and
'after'. If the data were normally distributed, you could use a paired
t-test.

  Is that a correct description?

  If so, then time should be treated as a fixed factor, group as random.
8 samples is probably enough (just).

  If your counts are fairly large (i.e. the minimum of the numbers
of 'successes' and 'failures' in a typical group is >5) then you could
safely use glmmPQL in the MASS package:

  glmmPQL(cbind(successes,failures)~time,random=~1|group,
          family="quasibinomial",data=...)
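
A self-contained version of that call, run on simulated data, might look like the sketch below. The data frame dat, the per-sample total of 60, the site effects, and the effect sizes are all invented for illustration; only the model formula, the random term, and the family follow the call above.

```r
library(MASS)  # provides glmmPQL (which fits via the nlme package)

# Simulated (hypothetical) data: 8 sites, each sampled before and after
set.seed(42)
n_sites <- 8
dat <- data.frame(
  group = factor(rep(seq_len(n_sites), each = 2)),
  time  = factor(rep(c("before", "after"), times = n_sites),
                 levels = c("before", "after"))
)
site_effect <- rnorm(n_sites, sd = 0.5)  # random intercept per site
eta <- -0.5 + 0.8 * (dat$time == "after") +
  site_effect[as.integer(dat$group)]
total <- 60                              # items counted per sample
dat$successes <- rbinom(nrow(dat), size = total, prob = plogis(eta))
dat$failures  <- total - dat$successes

# Time as a fixed factor, site (group) as a random intercept
fit <- glmmPQL(cbind(successes, failures) ~ time, random = ~ 1 | group,
               family = quasibinomial, data = dat, verbose = FALSE)

# Fixed-effect estimates with approximate t-tests; the 'timeafter' row
# is the before/after contrast on the logit scale
print(summary(fit)$tTable)
```

With only 8 groups the p-value in that table should be read as approximate rather than exact.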

  Have you thought about simply using a nonparametric test on the
proportions (i.e. wilcox.test(prop.before, prop.after,paired=TRUE) ... ?)
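
As a rough sketch of that nonparametric alternative: the proportions below are invented for 8 hypothetical sites, and prop.before / prop.after are just the placeholder names from the call above.

```r
# Hypothetical per-site proportions of type x (one value per site,
# same site order in both vectors)
prop.before <- c(0.21, 0.35, 0.28, 0.40, 0.18, 0.31, 0.26, 0.37)
prop.after  <- c(0.30, 0.37, 0.41, 0.45, 0.29, 0.30, 0.33, 0.41)

# Paired Wilcoxon signed-rank test on the within-site differences
res <- wilcox.test(prop.before, prop.after, paired = TRUE)
print(res)
```

This ignores the total counts behind each proportion, so it trades some power for not having to model the overdispersion at all.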

On Thu, Jul 29, 2010 at 12:07 PM, Javier Martinez
<javi.martinez.lopez at gmail.com> wrote:

Thanks to all of you! I did know about the e-mail by Bates, which is
beyond my understanding, but I did not know about the wiki on mixed
models or the manuscript by Bolker! My data are based on 2 temporal
samples from 8 different sites. I use mixed models because I want to
avoid pseudo-replication by including the grouping factor in my model,
thus looking for trends within each group rather than treating the
data as if they were independent. The question is, can I really use a
mixed model if I only have two cases per group? In the end there are
16 cases in the regression plot, but I am not sure whether such a
grouped analysis is right!

Thank you again for your help!

Javier

On Wed, Jul 28, 2010 at 6:44 PM, Javier Martinez
<javi.martinez.lopez at gmail.com> wrote:

Dear R-users,

I am using the 'lmer' function from package 'lme4', looking for a
regression model which takes into account the grouped nature of my
data. I am using frequencies as the dependent variable and percentages
as the independent one. After some reading I think I should use the
'quasibinomial' family because there is 'overdispersion' in my data
set (greater residual deviance than residual degrees of freedom). So,
I test this regression model but I do not get a p-value for the
regression! I have to test many different regressions with different
data, so how can I assess the significance of each of them?

Thank you very much for your help!

Javier