GLM mixed model with quasibinomial family
[cc'ing back to r-sig-ecology]

[Please keep sending replies to r-sig-ecology so that others may benefit from the conversation, and so that others can answer if I can't or am too busy (!)]

I don't know what you mean by the "degree of fit to your data". Are you trying to do a goodness-of-fit test? The standard deviance-based test for goodness of fit/overdispersion that you may be thinking of (e.g. see if residual deviance/residual df is approx. 1, or test the residual deviance against a chi-squared distribution with df = residual df) only applies to NON-overdispersed models.

You might want to get the book by Zuur et al. on mixed models in ecology.

Ben Bolker

On Fri, Jul 30, 2010 at 11:49 AM, Javier Martinez <javi.martinez.lopez at gmail.com> wrote:
Hello again Mr. Bolker,

I have now tried glmmPQL and it looks very promising, because I do in fact get the results I expected. Since I do not get a deviance from these models, I cannot assess their degree of fit to my data, so I was wondering whether it would be possible to assess it somehow by fitting a linear model between the expected and fitted values from the resulting glmmPQL model. Does that make sense to you?

Thank you very much for any advice, and regards,

Javier

On Thu, Jul 29, 2010 at 6:25 PM, Ben Bolker <bbolker at gmail.com> wrote:
A little more information would probably be helpful. Here's what I'm guessing:

You have no 'treatment' except the passage of time, and only two time points (say, before/after). You have a total of 16 measurements (2 each at 8 sites); they are like binomial data (number of counts of type x out of a total number N counted) but overdispersed. You want to test whether the proportion of type x changed between 'before' and 'after'. If the data were normally distributed, you could use a paired t-test.

Is that a correct description? If so, then time should be treated as a fixed factor, group as random. 8 samples is probably enough (just).

If your counts are fairly large (i.e. the minimum of the numbers of 'successes' and 'failures' in a typical group is >5) then you could safely use glmmPQL in the MASS package:

  glmmPQL(cbind(successes,failures)~time,random=~1|group,
          family="quasibinomial",data=...)

Have you thought about simply using a nonparametric test on the proportions (i.e. wilcox.test(prop.before, prop.after,paired=TRUE) ... ?)

On Thu, Jul 29, 2010 at 12:07 PM, Javier Martinez <javi.martinez.lopez at gmail.com> wrote:
Thanks to all of you!

I did know the e-mail by Bates, which is beyond my understanding, but I did not know about the wiki on mixed models or the manuscript by Bolker!

My data are based on 2 temporal samples from 8 different sites. I use mixed models because I want to avoid pseudo-replication by including the grouping factor in my model, and thus look at the trends within each group instead of treating the data as if they were independent. The question is, can I really use a mixed model if I only have two cases per group? In the end there are 16 cases in the regression plot, but I am not sure whether such a grouped analysis is right!

Thank you again for your help!

Javier

On Wed, Jul 28, 2010 at 6:44 PM, Javier Martinez <javi.martinez.lopez at gmail.com> wrote:
Dear R-users,

I am using the 'lmer' function from package 'lme4', looking for a regression model which takes into account the grouped nature of my data. I am using frequencies as the dependent variable and percentages as the independent one. After some reading I think I should use the 'quasibinomial' family because there is 'overdispersion' in my data set (the residual deviance is greater than the residual degrees of freedom). So I fit this regression model, but I do not get a p-value for the regression! I have to test many different regressions with different data, so how can I assess the significance of each of them?

Thank you very much for your help!

Javier
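
For readers who want to try the suggestions made in this thread, here is a minimal sketch in R. It is not anyone's actual analysis: the data are simulated, and the data frame `dat`, the column `total`, the 60 counts per sample and the effect sizes are all assumptions made up for illustration. Only the glmmPQL call, the names successes, failures, time, group, prop.before and prop.after, and the wilcox.test call come from the messages above. The deviance/df ratio is computed on a plain binomial GLM that ignores the grouping, so treat it only as a rough first look.

```r
# Sketch only: simulated data with hypothetical sizes and effects.
library(MASS)  # provides glmmPQL (which fits via nlme::lme internally)

set.seed(1)
n_sites <- 8
dat <- data.frame(
  group = factor(rep(seq_len(n_sites), each = 2)),
  time  = factor(rep(c("before", "after"), times = n_sites),
                 levels = c("before", "after")),
  total = 60  # assumed number counted per sample
)

# Simulate counts: a random site intercept plus a time effect on the logit scale
site_eff <- rnorm(n_sites, sd = 0.5)
eta <- -0.5 + site_eff[as.integer(dat$group)] + 0.6 * (dat$time == "after")
dat$successes <- rbinom(nrow(dat), size = dat$total, prob = plogis(eta))
dat$failures  <- dat$total - dat$successes

# 1. Rough overdispersion check on an ordinary (non-quasi) binomial GLM:
#    a ratio well above 1 hints at overdispersion. This fit ignores the sites.
fit_glm <- glm(cbind(successes, failures) ~ time, family = binomial, data = dat)
disp_ratio <- deviance(fit_glm) / df.residual(fit_glm)

# 2. The PQL fit suggested in the thread: time fixed, site (group) random
fit_pql <- glmmPQL(cbind(successes, failures) ~ time, random = ~ 1 | group,
                   family = quasibinomial, data = dat, verbose = FALSE)
time_test <- summary(fit_pql)$tTable  # the "timeafter" row holds the time effect

# 3. The nonparametric alternative: paired Wilcoxon test on the proportions.
#    Rows are ordered before/after within each site, so the pairs line up.
prop <- dat$successes / dat$total
prop.before <- prop[dat$time == "before"]
prop.after  <- prop[dat$time == "after"]
wt <- wilcox.test(prop.before, prop.after, paired = TRUE)

print(disp_ratio)
print(time_test)
print(wt)
```

With real data, replace the simulation step with your own data frame; the Wilcoxon test may warn that it cannot compute an exact p-value if some paired differences are tied.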