Hello,

I'm a master's student from Greece. I'm trying to model count data with a GLMM (lme4 package), using the number of parasites per fish as the discrete response variable and species (three different species) as the categorical predictor. I'm using the three different tanks I used as the random effect and the infection level as the fixed effect.

This is the model I'm running:

mod <- glmer(parasite ~ species + (1 | tank), family = poisson, data = mydata)

I noticed that the estimate of the intercept does not give the mean of the first species, so I ran a simple glm model to get the estimate. With summary() I got the p-values that allow me to reject my hypothesis and continue to the Tukey test. Is it legal to use

TukeyHSD(aov(parasite ~ species, data = mydata))?

Finally, I tested the assumptions and found violations of normality and independence. I also tried the MASS package, where the assumption of independent residuals was no longer violated, but the histogram gave me a much more skewed distribution, and anova() is not available for PQL fits.

mod2 <- glmmPQL(parasite ~ species, random = ~ 1 | tank, family = poisson, data = mydata)

Thank you in advance for your help.

Maria
clustered data with glmer() and glmmPQL()
4 messages · Marsela Alvanopoulou, Thierry Onkelinx
In reply to Marsela Alvanopoulou <marselalv at gmail.com>, 2015-06-15 15:06 GMT+02:00:

Dear Maria,

The assumption of normality is only required for the residuals of linear (mixed) models, not for the residuals of generalised linear (mixed) models.

You can't use aov() for two reasons: it assumes a Gaussian distribution and it assumes independent observations.

mod1 and mod2 are in principle the same model (but fitted differently). Both assume the same correlation structure.

Three levels are not enough to get a sensible variance estimate for a random effect. See the GLMM wiki FAQ for more details.

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
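On the intercept question raised in the original message: with family = poisson the model uses a log link, so the intercept is the log of the expected count for the reference species, not the mean itself. A minimal sketch on simulated data follows; the data frame `sim` and all its values are invented for illustration, and only the column names follow the thread.

```r
# Simulated stand-in for mydata; all values are invented for illustration.
set.seed(1)
sim <- data.frame(
  species = factor(rep(c("A", "B", "C"), each = 30)),
  tank    = factor(rep(c("t1", "t2", "t3"), times = 30))
)
true_mean <- c(A = 4, B = 8, C = 12)
sim$parasite <- rpois(nrow(sim), lambda = true_mean[as.character(sim$species)])

# Poisson GLM with species only: coefficients are on the log scale.
fit <- glm(parasite ~ species, family = poisson, data = sim)

# Back-transform the intercept to get the fitted mean of the reference species.
intercept_mean <- unname(exp(coef(fit)["(Intercept)"]))
observed_mean  <- mean(sim$parasite[sim$species == "A"])

print(c(back_transformed = intercept_mean, observed = observed_mean))
```

In this plain GLM the back-transformed intercept matches the raw mean of species A. In the glmer() fit the intercept is the log expected count for the reference species in a tank whose random effect is zero, so exp(intercept) will generally not equal the raw mean exactly.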
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
In reply to Thierry Onkelinx <thierry.onkelinx at inbo.be>, Mon, Jun 15, 2015 at 4:34 PM:

Hi again,

Thank you very much for your response. I found most of the answers in the GLMM wiki FAQ. I used MASS::glmmPQL for the model, and car::Anova and multcomp::glht for the hypothesis testing. I still need some work to check the effect of the tank, if any.

I also want to check whether there is a significant difference in the number of parasites per gram (a continuous response variable). I multiplied all values by 100 to get a discrete variable like before. Does that affect the final conclusions?

Thanks again!
Marsela Alvanopoulou
MSc Student, Biology Dept., University of Bergen, Norway
e-mail: Marsela.Alvanopoulou at student.uib.no
e-mail: marsela.alvanopoulou at imr.no
linkedin: gr.linkedin.com/pub/marsela-alvanopoulou/69/3b3/410/
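The message above mentions multcomp::glht for the hypothesis tests. As a rough sketch of what Tukey-style pairwise comparisons on a Poisson model (rather than on aov()) can look like, the example below uses simulated data and, as a simplifying assumption, a plain glm() with tank as a fixed term instead of the mixed model from the thread. It assumes the multcomp package is installed and skips the comparison step otherwise.

```r
# Simulated stand-in for mydata; all values are invented for illustration.
set.seed(2)
sim <- data.frame(
  species = factor(rep(c("A", "B", "C"), each = 30)),
  tank    = factor(rep(c("t1", "t2", "t3"), times = 30))
)
true_mean <- c(A = 4, B = 8, C = 12)
sim$parasite <- rpois(nrow(sim), lambda = true_mean[as.character(sim$species)])

# Poisson GLM with tank as a fixed term (a simplification of the mixed model).
fit <- glm(parasite ~ species + tank, family = poisson, data = sim)

# All pairwise species contrasts on the log scale, with multiplicity adjustment.
# Guarded so the script still runs when multcomp is not installed.
if (requireNamespace("multcomp", quietly = TRUE)) {
  tukey <- multcomp::glht(fit, linfct = multcomp::mcp(species = "Tukey"))
  print(summary(tukey))
}
```

The contrasts are differences of log means, so exponentiating an estimate gives a ratio of expected counts between two species.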
In reply to Marsela Alvanopoulou <marselalv at gmail.com>, 2015-06-16 1:54 GMT+02:00:

You can use tank as a fixed effect instead of a random effect. In that case your model reduces to a generalised linear model.

Personally, I prefer a likelihood-based model (glmer) over a penalised quasi-likelihood model (glmmPQL), unless I need things that are not available with glmer.

You need to use the log(weight) of the fish as an offset term instead of calculating the ratio.

ir. Thierry Onkelinx
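The offset suggestion from the last reply can be sketched as follows. The data are simulated, and `weight` is a hypothetical column name for fish weight in grams (it does not appear in the thread). Tank is entered as a fixed effect, as suggested in that reply, so plain glm() is enough here; an offset() term can be written the same way in a glmer() formula.

```r
# Simulated stand-in for mydata; all values are invented for illustration.
set.seed(3)
sim <- data.frame(
  species = factor(rep(c("A", "B", "C"), each = 30)),
  tank    = factor(rep(c("t1", "t2", "t3"), times = 30)),
  weight  = runif(90, min = 20, max = 200)   # hypothetical fish weight in grams
)
# Parasites per gram differ by species; expected count scales with weight.
rate <- c(A = 0.05, B = 0.10, C = 0.15)[as.character(sim$species)]
sim$parasite <- rpois(nrow(sim), lambda = rate * sim$weight)

# The response stays the raw integer count; log(weight) enters as an offset,
# so the species coefficients describe log rates (parasites per gram).
fit_rate <- glm(parasite ~ species + tank + offset(log(weight)),
                family = poisson, data = sim)

# exp() of the species coefficients gives the ratio of parasites per gram
# relative to the reference species A.
rate_ratio <- exp(coef(fit_rate)[c("speciesB", "speciesC")])
print(rate_ratio)
```

With this formulation there is no need to multiply the per-gram values by 100 and round them: the counts are modelled directly and the weight is accounted for on the log scale.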