An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20100818/b537ffc0/attachment.pl>
my first random effects logistic regression
7 messages · dieter.anseeuw, Luciano Selzer, David Duffy +4 more
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20100818/a98eed53/attachment.pl>
On Wed, 18 Aug 2010, dieter.anseeuw wrote:
A friend has inspected three randomly chosen farms (random factor 'farm'). At each farm three randomly chosen series of chickens (random factor 'flock' nested within 'farm') were each inspected for the presence of a certain bacteria. The contaminated chickens were counted (response variable 'positives'). The sample sizes per flock are given by the variable 'broilers'. We want to have a look at within-broilers, within-farm and between-farm variability.
model1<-glmer(positives~1 + (flock|farm), data=broilers.dat, family=binomial(link="logit"), weights=broilers)
Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1
glmer() would like to know how many tested chickens were negative. cbind(positives, negatives) ~ Cheers, David Duffy
| David Duffy (MBBS PhD) ,-_|\ | email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / * | Epidemiology Unit, Queensland Institute of Medical Research \_,-._/ | 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v
Dear Dieter, You allready got the comments needed to get to code working. I would like to point to a flaw in your random effect specification. The correct specification for your design would be (1|farm/flock). However I would model it in this case as (1|farm:flock) becasue you have only 3 farms. Which is very low, so the variance estimates would not be very reliable. HTH, Thierry ------------------------------------------------------------------------ ---- ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek team Biometrie & Kwaliteitszorg Gaverstraat 4 9500 Geraardsbergen Belgium Research Institute for Nature and Forest team Biometrics & Quality Assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 Thierry.Onkelinx at inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
-----Oorspronkelijk bericht----- Van: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] Namens dieter.anseeuw Verzonden: woensdag 18 augustus 2010 9:29 Aan: r-sig-mixed-models at r-project.org Onderwerp: [R-sig-ME] my first random effects logistic regression Hi all, I am new to using lme4 and only just discovered the mailing list (I already shortly introduced my problem in the epi-mailinglist, sorry for cross-posting, but my questions seem more appropriate for the mixed models discussion group). A friend has inspected three randomly chosen farms (random factor 'farm'). At each farm three randomly chosen series of chickens (random factor 'flock' nested within 'farm') were each inspected for the presence of a certain bacteria. The contaminated chickens were counted (response variable 'positives'). The sample sizes per flock are given by the variable 'broilers'. We want to have a look at within-broilers, within-farm and between-farm variability. This is the closest I get to analysing her data:
broilers.dat<-data.frame(farm=c("FA","FA","FA","FB","FB","FB","FC","FC
","FC"), flock=c("a","b","c","d","e","f","g","h","i"),
broilers=c(50,
rep(25,8)), positives=c(7,2,0,7,2,0,0,0,2))
library(lme4)
model1<-glmer(positives~1 + (flock|farm), data=broilers.dat, family=binomial(link="logit"), weights=broilers)
Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1 Hence, my code doesn't work. Could anybody help me out where and why I go wrong? Many thanks in advance, Dieter -- Dr. Ir. Dieter Anseeuw Katho Campus Roeselare Wilgenstraat 32 8800 Roeselare Belgium Direct phone: +32 51 23 29 68 http://www.katho.be/hivb http://www.linkedin.com/in/dieteranseeuw [[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Druk dit bericht a.u.b. niet onnodig af. Please do not print this message unnecessarily. Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is door een geldig ondertekend document. The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.
Hi Dieter,
I was writing this as Thierry posted. My recommendation is similar,
although you wont be able to fit farm:flock in glmer because it will
complain about the number of random effects being equal to the number
of data. Anyhow....
I'm not sure if I have understood the sampling design correctly, but
it seems there are 9 flocks over 3 farms, 3 in each farm? If this is
the case you are going to run into problems with (flock|farm). This
specification is trying to estimate a 9X9 matrix which is the between
farm variance for each flock along the diagonals, and the between farm
covariances between flocks in the off-diagonals. This is 45
(co)variance parameters from 9 data points. I think I would try
something simpler:
model2<-glm(cbind(positives, broilers-positives)~farm,
data=broilers.dat, family=quasibinomial)
The quasi bit is capturing the flock effects, and the farm effects are
fitted as fixed. Another way to do it is to expand the binomial
response into a series of binary data and fit flock as a random effect:
broilers.dat2<-broilers.dat[rep(1:9, broilers.dat$broilers),]
broilers.dat2$positives<-
(broilers.dat2$positives>=unlist(sapply(broilers.dat$broilers,
function(x){1:x})))
tapply(broilers.dat2$positives, broilers.dat2$flock, sum) # check its
OK - it is
model3<-glmer(positives~farm+(1|flock), data=broilers.dat2,
family=binomial)
you could try farm as well, but with only 3 I would be careful
model4<-glmer(positives~(1|farm)+(1|flock), data=broilers.dat2,
family=binomial)
This model:
model5<-glmer(positives~(1|flock), data=broilers.dat2, family=binomial)
is equivalent to the one posted by Thierry , but it will run.
Cheers,
Jarrod
On 18 Aug 2010, at 08:28, dieter.anseeuw wrote:
Hi all, I am new to using lme4 and only just discovered the mailing list (I already shortly introduced my problem in the epi-mailinglist, sorry for cross-posting, but my questions seem more appropriate for the mixed models discussion group). A friend has inspected three randomly chosen farms (random factor 'farm'). At each farm three randomly chosen series of chickens (random factor 'flock' nested within 'farm') were each inspected for the presence of a certain bacteria. The contaminated chickens were counted (response variable 'positives'). The sample sizes per flock are given by the variable 'broilers'. We want to have a look at within-broilers, within-farm and between-farm variability. This is the closest I get to analysing her data:
broilers.dat<-
data.frame(farm=c("FA","FA","FA","FB","FB","FB","FC","FC","FC"),
flock=c("a","b","c","d","e","f","g","h","i"), broilers=c(50,
rep(25,8)), positives=c(7,2,0,7,2,0,0,0,2))
library(lme4)
model1<-glmer(positives~1 + (flock|farm), data=broilers.dat, family=binomial(link="logit"), weights=broilers)
Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1 Hence, my code doesn't work. Could anybody help me out where and why I go wrong? Many thanks in advance, Dieter -- Dr. Ir. Dieter Anseeuw Katho Campus Roeselare Wilgenstraat 32 8800 Roeselare Belgium Direct phone: +32 51 23 29 68 http://www.katho.be/hivb http://www.linkedin.com/in/dieteranseeuw [[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
"JH" == Jarrod Hadfield <j.hadfield at ed.ac.uk>
on Wed, 18 Aug 2010 09:56:36 +0100 writes:
JH> Hi Dieter, I was writing this as Thierry posted. My
JH> recommendation is similar, although you wont be able to
JH> fit farm:flock in glmer because it will complain about
JH> the number of random effects being equal to the number
JH> of data. Anyhow....
That ("you wont be able ...") hasn't been true for a "while",
well at least since lme4 0.999375-34 has been released...
Martin
[.......................]
That ("you wont be able ...") hasn't been true for a "while",
well at least since lme4 0.999375-34 has been released...
Was going to mention that, too. Hence, glmer happily fits all the models
that have been proposed and they all give the same, or very similar,
estimates, even the very complex (given the data) ones:
######## the data
broilers.dat<-
data.frame(farm=c("FA","FA","FA","FB","FB","FB","FC","FC","FC"),
flock=c("a","b","c","d","e","f","g","h","i"), broilers=c(50,
rep(25,8)), positives=c(7,2,0,7,2,0,0,0,2))
## model 1 and 2 are identical because "flock" has got a unique identifier
model1<-glmer(cbind(positives, broilers-positives) ~ 1 + (1|farm:flock),
data=broilers.dat,
family=binomial(link="logit"))
model2 <-glmer(cbind(positives, broilers-positives) ~ 1 + (1|flock),
data=broilers.dat,
family=binomial(link="logit"))
### compare model 3 and the quasi-glm suggested by Jarrod
model3 <-glmer(cbind(positives, broilers-positives) ~ farm + (1|flock),
data=broilers.dat,
family=binomial(link="logit"))
glm3 <-glm(cbind(positives, broilers-positives)~farm,
data=broilers.dat, family=quasibinomial)
# response as binary
broilers.dat2<-broilers.dat[rep(1:9, broilers.dat$broilers),]
broilers.dat2$positives<-
(broilers.dat2$positives>=unlist(sapply(broilers.dat$broilers,
function(x){1:x})))
model4 <-glmer(positives~farm+(1|flock), data=broilers.dat2,
family=binomial)
model5<-glmer(positives~(1|farm)+(1|flock), data=broilers.dat2,
family=binomial)
model6 <-glmer(positives~ farm + (1|farm)+(1|flock), data=broilers.dat2,
family=binomial)
###########################
print(model1, corr=FALSE)
Generalized linear mixed model fit by the Laplace approximation
Formula: cbind(positives, broilers - positives) ~ 1 + (1 | farm:flock)
Data: broilers.dat
AIC BIC logLik deviance
24.17 24.56 -10.08 20.17
Random effects:
Groups Name Variance Std.Dev.
farm:flock (Intercept) 1.4175 1.1906
Number of obs: 9, groups: farm:flock, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.0486 0.5061 -6.024 1.70e-09 ***
---
###########################
print(model2, corr=FALSE)
Generalized linear mixed model fit by the Laplace approximation
Formula: cbind(positives, broilers - positives) ~ 1 + (1 | flock)
Data: broilers.dat
AIC BIC logLik deviance
24.17 24.56 -10.08 20.17
Random effects:
Groups Name Variance Std.Dev.
flock (Intercept) 1.4175 1.1906
Number of obs: 9, groups: flock, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.0486 0.5061 -6.024 1.70e-09 ***
---
###########################
print(model5, corr=FALSE)
Generalized linear mixed model fit by the Laplace approximation
Formula: positives ~ (1 | farm) + (1 | flock)
Data: broilers.dat2
AIC BIC logLik deviance
138.1 148.7 -66.06 132.1
Random effects:
Groups Name Variance Std.Dev.
flock (Intercept) 1.4175 1.1906
farm (Intercept) 0.0000 0.0000
Number of obs: 250, groups: flock, 9; farm, 3
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.0487 0.5061 -6.024 1.7e-09 ***
---
###########################
print(model3, corr=FALSE)
Generalized linear mixed model fit by the Laplace approximation
Formula: cbind(positives, broilers - positives) ~ farm + (1 | flock)
Data: broilers.dat
AIC BIC logLik deviance
26.25 27.04 -9.124 18.25
Random effects:
Groups Name Variance Std.Dev.
flock (Intercept) 0.87752 0.93676
Number of obs: 9, groups: flock, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.7264 0.6939 -3.929 8.54e-05 ***
farmFB 0.3737 0.9752 0.383 0.702
farmFC -1.2609 1.1981 -1.052 0.293
---
###########################
summary(glm3)
Call:
glm(formula = cbind(positives, broilers - positives) ~ farm,
family = quasibinomial, data = broilers.dat)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.5282 -1.1625 -0.6503 1.1514 2.1536
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.3136 0.6050 -3.824 0.00872 **
farmFB 0.3212 0.8629 0.372 0.72250
farmFC -1.2837 1.3806 -0.930 0.38835
---
(Dispersion parameter for quasibinomial family taken to be 2.997898)
Null deviance: 27.425 on 8 degrees of freedom
Residual deviance: 22.030 on 6 degrees of freedom
AIC: NA
Number of Fisher Scoring iterations: 5
###########################
print(model4, corr=FALSE)
Generalized linear mixed model fit by the Laplace approximation
Formula: positives ~ farm + (1 | flock)
Data: broilers.dat2
AIC BIC logLik deviance
138.2 152.3 -65.1 130.2
Random effects:
Groups Name Variance Std.Dev.
flock (Intercept) 0.87752 0.93676
Number of obs: 250, groups: flock, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.7264 0.6939 -3.929 8.54e-05 ***
farmFB 0.3737 0.9752 0.383 0.702
farmFC -1.2610 1.1981 -1.053 0.293
---
###########################
print(model6, corr=FALSE)
Generalized linear mixed model fit by the Laplace approximation
Formula: positives ~ farm + (1 | farm) + (1 | flock)
Data: broilers.dat2
AIC BIC logLik deviance
140.2 157.8 -65.1 130.2
Random effects:
Groups Name Variance Std.Dev.
flock (Intercept) 0.87752 0.93676
farm (Intercept) 0.00000 0.00000
Number of obs: 250, groups: flock, 9; farm, 3
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.7263 0.6939 -3.929 8.54e-05 ***
farmFB 0.3737 0.9752 0.383 0.702
farmFC -1.2611 1.1981 -1.053 0.293
---
############
sessionInfo()
R version 2.11.1 (2010-05-31) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252 [4] LC_NUMERIC=C LC_TIME=English_United Kingdom.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] lme4_0.999375-34 Matrix_0.999375-43 lattice_0.18-8 loaded via a namespace (and not attached): [1] grid_2.11.1 nlme_3.1-96 stats4_2.11.1 tools_2.11.1
########################################################################## Hope this is of any use. Cheers, Luca ----- Original Message ----- From: "Martin Maechler" <maechler at stat.math.ethz.ch> To: "Jarrod Hadfield" <j.hadfield at ed.ac.uk> Cc: <r-sig-mixed-models at r-project.org> Sent: Wednesday, August 18, 2010 5:49 AM Subject: Re: [R-sig-ME] my first random effects logistic regression
"JH" == Jarrod Hadfield <j.hadfield at ed.ac.uk>
on Wed, 18 Aug 2010 09:56:36 +0100 writes:
JH> Hi Dieter, I was writing this as Thierry posted. My
JH> recommendation is similar, although you wont be able to
JH> fit farm:flock in glmer because it will complain about
JH> the number of random effects being equal to the number
JH> of data. Anyhow....
That ("you wont be able ...") hasn't been true for a "while",
well at least since lme4 0.999375-34 has been released...
Martin
[.......................]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models