There's a fairly recent paper by Zhang et al (2011) of interest to folks on this list DOI: 10.1002/sim.4265 In response to a post on the AD Model Builder users' list, I took a quick shot at re-doing some of their results (they have extensive simulation results, which I haven't tried to replicate yet, and an analysis of binary data from Davis (1991) which is included (I *think* it's the same data set -- the description and size of the data set match exactly) in the geepack data set). If anyone's interested, my results so far are posted at http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.Rnw http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.pdf So far the R approaches I've tried agree closely with each other and with glmmADMB (except MASS::glmmPQL, which I expected to be different -- the rest all use either Laplace approx. or AGHQ). They *don't* agree with the results Zhang et al got, yet -- I'm sure there's something I'm missing in the contrasts or otherwise ... Suggestions or improvements are welcome. cheers Ben Bolker
Zhang 2011 (re)analysis
7 messages · Ben Bolker, Reinhold Kliegl, Saang-Yoon Hyun +1 more
On 11-10-31 05:22 AM, Reinhold Kliegl wrote:
One problem appears to be that 111 id's are renumbered from 1 to 55 (56) in the two groups. Unfortunately, it also appears that there is no unique mapping to treatment groups. So there are some subjects with 8 values assigned to one of the groups.
Thanks. It looks like IDs are nested within center (not within treatment). That doesn't seem to change the story very much (as far as , though (Zhang et al don't report estimated random-effect variances ...)
library(geepack) data(respiratory) resp1 <- respiratory resp1 <- transform(resp1,
+ center=factor(center), + id=factor(id))
str(resp1)
'data.frame': 444 obs. of 8 variables: $ center : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ... $ id : Factor w/ 56 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ... $ treat : Factor w/ 2 levels "A","P": 2 2 2 2 2 2 2 2 1 1 ... $ sex : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ... $ age : int 46 46 46 46 28 28 28 28 23 23 ... $ baseline: int 0 0 0 0 0 0 0 0 1 1 ... $ visit : int 1 2 3 4 1 2 3 4 1 2 ... $ outcome : int 0 0 0 0 0 0 0 0 1 1 ...
detach("package:geepack") ## allow detaching of doBy
detach("package:doBy") ## allow detaching of lme4
The data appear also in the HSAUR package, here the 111 subjects identified with 5 months (visits) each. I suspect month 0 was used as baseline.
library(HSAUR) data(respiratory) resp2 <- respiratory str(resp2)
'data.frame': 555 obs. of 7 variables: $ centre : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ... $ treatment: Factor w/ 2 levels "placebo","treatment": 1 1 1 1 1 1 1 1 1 1 ... $ sex : Factor w/ 2 levels "female","male": 1 1 1 1 1 1 1 1 1 1 ... $ age : num 46 46 46 46 46 28 28 28 28 28 ... $ status : Factor w/ 2 levels "poor","good": 1 1 1 1 1 1 1 1 1 1 ... $ month : Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 1 2 3 4 5 1 2 3 4 5 ... $ subject : Factor w/ 111 levels "1","2","3","4",..: 1 1 1 1 1 2 2 2 2 2 ... Reinhold On Sun, Oct 30, 2011 at 10:00 PM, Ben Bolker <bbolker at gmail.com> wrote:
There's a fairly recent paper by Zhang et al (2011) of interest to folks on this list DOI: 10.1002/sim.4265 In response to a post on the AD Model Builder users' list, I took a quick shot at re-doing some of their results (they have extensive simulation results, which I haven't tried to replicate yet, and an analysis of binary data from Davis (1991) which is included (I *think* it's the same data set -- the description and size of the data set match exactly) in the geepack data set). If anyone's interested, my results so far are posted at http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.Rnw http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.pdf So far the R approaches I've tried agree closely with each other and with glmmADMB (except MASS::glmmPQL, which I expected to be different -- the rest all use either Laplace approx. or AGHQ). They *don't* agree with the results Zhang et al got, yet -- I'm sure there's something I'm missing in the contrasts or otherwise ... Suggestions or improvements are welcome. cheers Ben Bolker
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
On 11-10-31 07:49 PM, Saang-Yoon Hyun wrote:
Hi, Could you show the reference: e.g., authors, year, tilte, journal, pages, etc.? If you could share its PDF version of the paper, it would be more helpful. Thank you, Saang-Yoon
DOI: 10.1002/sim.4265 On fitting generalized linear mixed-effects models for binary responses using different statistical packages Hui Zhang, Naiji Lu, Changyong Feng, Sally W. Thurston, Yinglin Xia, Liang Zhua and Xin M. Tu Statistics in Medicine. Apparently no page numbers (online publication?)
On Mon, Oct 31, 2011 at 12:15 PM, Reinhold Kliegl
<reinhold.kliegl at gmail.com <mailto:reinhold.kliegl at gmail.com>> wrote:
Here is some comparison between glm, glmer (using lme4 with Laplace)
and sabreR (which uses AGHQ, if I recall correctly).
The glm analysis replicates exactly the results reported in Everitt
and Hothorn (2002, Table 13.1, around page 175). Thus, I am pretty
sure I am using the correct data.
sabreR gives estimates both for the standard homogeneous model,
replicating the glm(), as well as a random effects model, replicating
glmer() pretty closely, I think
Reinhold
> # HSAUR contains respiratory data
> data("respiratory", package = "HSAUR")
>
> # Convert pretest to covariate "baseline"
> resp <- subset(respiratory, month > "0" )
> resp$baseline <- rep(subset(respiratory, month == "0")$status, each=4)
>
> # Numeric variants of factors
> resp$centr.b <- as.integer(resp$centre) - 1
> resp$treat.b <- as.integer(resp$treatment) - 1
> resp$sex.b <- as.integer(resp$sex)-1
> resp$pre.b <- as.integer(resp$baseline) - 1
> resp$status.b <- as.integer(resp$status) - 1
> resp$id <- as.integer(resp$subject)
>
> # Pooled estimate
> summary(m_glm.b <-glm(status.b ~ centr.b + treat.b + sex.b + pre.b
+ age, family="binomial", data=resp))
Call:
glm(formula = status.b ~ centr.b + treat.b + sex.b + pre.b +
age, family = "binomial", data = resp)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.3146 -0.8551 0.4336 0.8953 1.9246
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.900171 0.337653 -2.666 0.00768 **
centr.b 0.671601 0.239567 2.803 0.00506 **
treat.b 1.299216 0.236841 5.486 4.12e-08 ***
sex.b 0.119244 0.294671 0.405 0.68572
pre.b 1.882029 0.241290 7.800 6.20e-15 ***
age -0.018166 0.008864 -2.049 0.04043 *
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 608.93 on 443 degrees of freedom
Residual deviance: 483.22 on 438 degrees of freedom
AIC: 495.22
Number of Fisher Scoring iterations: 4
>
> # glmer
> library(lme4)
> print(m_glmer_4.L <- glmer(status ~ centre + treatment + sex +
baseline + age + (1|subject),
+ family=binomial,data=resp), cor=FALSE)
Generalized linear mixed model fit by the Laplace approximation
Formula: status ~ centre + treatment + sex + baseline + age + (1 |
subject)
Data: resp
AIC BIC logLik deviance
443 471.7 -214.5 429
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 3.8647 1.9659 <tel:3.8647%20%20%201.9659>
Number of obs: 444, groups: subject, 111
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.64438 0.75829 -2.169 0.0301 *
centre2 1.04382 0.53193 1.962 0.0497 *
treatmenttreatment 2.15746 0.51757 4.168 3.07e-05 ***
sexmale 0.20194 0.66117 0.305 0.7600
baselinegood 3.06990 0.52608 5.835 5.37e-09 ***
age -0.02540 0.01998 -1.271 0.2037
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>
> # Pooled and clustered
> library(sabreR)
> attach(resp)
> m_sabreR <- sabre(status.b ~ centr.b + treat.b + sex.b + pre.b +
age, case=id)
> print(m_sabreR)
# ... deleted some output
(Standard Homogenous Model)
Parameter Estimate Std. Err. Z-score
____________________________________________________________________
(intercept) -0.90017 0.33765 -2.6660
centr.b 0.67160 0.23957 2.8034
treat.b 1.2992 0.23684 5.4856
sex.b 0.11924 0.29467 0.40467
pre.b 1.8820 0.24129 7.7999
age -0.18166E-01 0.88644E-02 -2.0493
(Random Effects Model)
Parameter Estimate Std. Err. Z-score
____________________________________________________________________
(intercept) -1.6642 0.84652 -1.9660
centr.b 0.99044 0.56561 1.7511
treat.b 2.1265 0.57198 3.7177
sex.b 0.18166 0.70814 0.25653
pre.b 2.9987 0.60174 4.9834
age -0.22949E-01 0.21337E-01 -1.0755
scale 1.9955 0.32093 6.2180
# ... deleted some output
> detach()
On Mon, Oct 31, 2011 at 2:10 AM, Ben Bolker <bbolker at gmail.com
<mailto:bbolker at gmail.com>> wrote:
> On 11-10-31 05:22 AM, Reinhold Kliegl wrote:
>> One problem appears to be that 111 id's are renumbered from 1 to 55
>> (56) in the two groups.
>> Unfortunately, it also appears that there is no unique mapping to
>> treatment groups. So there are some subjects with 8 values
assigned to
>> one of the groups.
>
> Thanks. It looks like IDs are nested within center (not within
> treatment). That doesn't seem to change the story very much (as
far as
> , though (Zhang et al don't report estimated random-effect
variances ...)
>
>
>
>>> library(geepack)
>>> data(respiratory)
>>> resp1 <- respiratory
>>> resp1 <- transform(resp1,
>> + center=factor(center),
>> + id=factor(id))
>>>
>>> str(resp1)
>> 'data.frame': 444 obs. of 8 variables:
>> $ center : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
>> $ id : Factor w/ 56 levels "1","2","3","4",..: 1 1 1 1 2 2
2 2 3 3 ...
>> $ treat : Factor w/ 2 levels "A","P": 2 2 2 2 2 2 2 2 1 1 ...
>> $ sex : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ...
>> $ age : int 46 46 46 46 28 28 28 28 23 23 ...
>> $ baseline: int 0 0 0 0 0 0 0 0 1 1 ...
>> $ visit : int 1 2 3 4 1 2 3 4 1 2 ...
>> $ outcome : int 0 0 0 0 0 0 0 0 1 1 ...
>>> detach("package:geepack") ## allow detaching of doBy
>>> detach("package:doBy") ## allow detaching of lme4
>>
>> The data appear also in the HSAUR package, here the 111 subjects
>> identified with 5 months (visits) each. I suspect month 0 was
used as
>> baseline.
>>> library(HSAUR)
>>> data(respiratory)
>>> resp2 <- respiratory
>>>
>>> str(resp2)
>> 'data.frame': 555 obs. of 7 variables:
>> $ centre : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
>> $ treatment: Factor w/ 2 levels "placebo","treatment": 1 1 1 1 1
1 1 1 1 1 ...
>> $ sex : Factor w/ 2 levels "female","male": 1 1 1 1 1 1 1 1
1 1 ...
>> $ age : num 46 46 46 46 46 28 28 28 28 28 ...
>> $ status : Factor w/ 2 levels "poor","good": 1 1 1 1 1 1 1 1 1
1 ...
>> $ month : Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 1 2 3 4
5 1 2 3 4 5 ...
>> $ subject : Factor w/ 111 levels "1","2","3","4",..: 1 1 1 1 1
2 2 2 2 2 ...
>>
>> Reinhold
>>
>> On Sun, Oct 30, 2011 at 10:00 PM, Ben Bolker <bbolker at gmail.com
<mailto:bbolker at gmail.com>> wrote:
>>>
>>> There's a fairly recent paper by Zhang et al (2011) of interest to
>>> folks on this list
>>>
>>> DOI: 10.1002/sim.4265
>>>
>>> In response to a post on the AD Model Builder users' list, I took a
>>> quick shot at re-doing some of their results (they have extensive
>>> simulation results, which I haven't tried to replicate yet, and an
>>> analysis of binary data from Davis (1991) which is included (I
*think*
>>> it's the same data set -- the description and size of the data
set match
>>> exactly) in the geepack data set).
>>>
>>> If anyone's interested, my results so far are posted at
>>>
>>> http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.Rnw
>>> http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.pdf
>>>
>>> So far the R approaches I've tried agree closely with each
other and
>>> with glmmADMB (except MASS::glmmPQL, which I expected to be
different --
>>> the rest all use either Laplace approx. or AGHQ). They *don't*
agree
>>> with the results Zhang et al got, yet -- I'm sure there's
something I'm
>>> missing in the contrasts or otherwise ...
>>>
>>> Suggestions or improvements are welcome.
>>>
>>> cheers
>>> Ben Bolker
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org
<mailto:R-sig-mixed-models at r-project.org> mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>
>
_______________________________________________
R-sig-mixed-models at r-project.org
<mailto:R-sig-mixed-models at r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
One problem appears to be that 111 id's are renumbered from 1 to 55 (56) in the two groups. Unfortunately, it also appears that there is no unique mapping to treatment groups. So there are some subjects with 8 values assigned to one of the groups.
library(geepack) data(respiratory) resp1 <- respiratory resp1 <- transform(resp1,
+ center=factor(center), + id=factor(id))
str(resp1)
'data.frame': 444 obs. of 8 variables: $ center : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ... $ id : Factor w/ 56 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ... $ treat : Factor w/ 2 levels "A","P": 2 2 2 2 2 2 2 2 1 1 ... $ sex : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ... $ age : int 46 46 46 46 28 28 28 28 23 23 ... $ baseline: int 0 0 0 0 0 0 0 0 1 1 ... $ visit : int 1 2 3 4 1 2 3 4 1 2 ... $ outcome : int 0 0 0 0 0 0 0 0 1 1 ...
detach("package:geepack") ## allow detaching of doBy
detach("package:doBy") ## allow detaching of lme4
The data appear also in the HSAUR package, here the 111 subjects identified with 5 months (visits) each. I suspect month 0 was used as baseline.
library(HSAUR) data(respiratory) resp2 <- respiratory str(resp2)
'data.frame': 555 obs. of 7 variables: $ centre : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ... $ treatment: Factor w/ 2 levels "placebo","treatment": 1 1 1 1 1 1 1 1 1 1 ... $ sex : Factor w/ 2 levels "female","male": 1 1 1 1 1 1 1 1 1 1 ... $ age : num 46 46 46 46 46 28 28 28 28 28 ... $ status : Factor w/ 2 levels "poor","good": 1 1 1 1 1 1 1 1 1 1 ... $ month : Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 1 2 3 4 5 1 2 3 4 5 ... $ subject : Factor w/ 111 levels "1","2","3","4",..: 1 1 1 1 1 2 2 2 2 2 ... Reinhold
On Sun, Oct 30, 2011 at 10:00 PM, Ben Bolker <bbolker at gmail.com> wrote:
?There's a fairly recent paper by Zhang et al (2011) of interest to folks on this list DOI: 10.1002/sim.4265 ?In response to a post on the AD Model Builder users' list, I took a quick shot at re-doing some of their results (they have extensive simulation results, which I haven't tried to replicate yet, and an analysis of binary data from Davis (1991) which is included (I *think* it's the same data set -- the description and size of the data set match exactly) in the geepack data set). ?If anyone's interested, my results so far are posted at http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.Rnw http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.pdf ?So far the R approaches I've tried agree closely with each other and with glmmADMB (except MASS::glmmPQL, which I expected to be different -- the rest all use either Laplace approx. or AGHQ). ?They *don't* agree with the results Zhang et al got, yet -- I'm sure there's something I'm missing in the contrasts or otherwise ... ?Suggestions or improvements are welcome. ?cheers ? ?Ben Bolker
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Here is some comparison between glm, glmer (using lme4 with Laplace) and sabreR (which uses AGHQ, if I recall correctly). The glm analysis replicates exactly the results reported in Everitt and Hothorn (2002, Table 13.1, around page 175). Thus, I am pretty sure I am using the correct data. sabreR gives estimates both for the standard homogeneous model, replicating the glm(), as well as a random effects model, replicating glmer() pretty closely, I think Reinhold
# HSAUR contains respiratory data
data("respiratory", package = "HSAUR")
# Convert pretest to covariate "baseline"
resp <- subset(respiratory, month > "0" )
resp$baseline <- rep(subset(respiratory, month == "0")$status, each=4)
# Numeric variants of factors
resp$centr.b <- as.integer(resp$centre) - 1
resp$treat.b <- as.integer(resp$treatment) - 1
resp$sex.b <- as.integer(resp$sex)-1
resp$pre.b <- as.integer(resp$baseline) - 1
resp$status.b <- as.integer(resp$status) - 1
resp$id <- as.integer(resp$subject)
# Pooled estimate
summary(m_glm.b <-glm(status.b ~ centr.b + treat.b + sex.b + pre.b + age, family="binomial", data=resp))
Call:
glm(formula = status.b ~ centr.b + treat.b + sex.b + pre.b +
age, family = "binomial", data = resp)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.3146 -0.8551 0.4336 0.8953 1.9246
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.900171 0.337653 -2.666 0.00768 **
centr.b 0.671601 0.239567 2.803 0.00506 **
treat.b 1.299216 0.236841 5.486 4.12e-08 ***
sex.b 0.119244 0.294671 0.405 0.68572
pre.b 1.882029 0.241290 7.800 6.20e-15 ***
age -0.018166 0.008864 -2.049 0.04043 *
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 608.93 on 443 degrees of freedom
Residual deviance: 483.22 on 438 degrees of freedom
AIC: 495.22
Number of Fisher Scoring iterations: 4
# glmer library(lme4) print(m_glmer_4.L <- glmer(status ~ centre + treatment + sex + baseline + age + (1|subject),
+ family=binomial,data=resp), cor=FALSE)
Generalized linear mixed model fit by the Laplace approximation
Formula: status ~ centre + treatment + sex + baseline + age + (1 | subject)
Data: resp
AIC BIC logLik deviance
443 471.7 -214.5 429
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 3.8647 1.9659
Number of obs: 444, groups: subject, 111
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.64438 0.75829 -2.169 0.0301 *
centre2 1.04382 0.53193 1.962 0.0497 *
treatmenttreatment 2.15746 0.51757 4.168 3.07e-05 ***
sexmale 0.20194 0.66117 0.305 0.7600
baselinegood 3.06990 0.52608 5.835 5.37e-09 ***
age -0.02540 0.01998 -1.271 0.2037
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
# Pooled and clustered library(sabreR) attach(resp) m_sabreR <- sabre(status.b ~ centr.b + treat.b + sex.b + pre.b + age, case=id) print(m_sabreR)
# ... deleted some output
(Standard Homogenous Model)
Parameter Estimate Std. Err. Z-score
____________________________________________________________________
(intercept) -0.90017 0.33765 -2.6660
centr.b 0.67160 0.23957 2.8034
treat.b 1.2992 0.23684 5.4856
sex.b 0.11924 0.29467 0.40467
pre.b 1.8820 0.24129 7.7999
age -0.18166E-01 0.88644E-02 -2.0493
(Random Effects Model)
Parameter Estimate Std. Err. Z-score
____________________________________________________________________
(intercept) -1.6642 0.84652 -1.9660
centr.b 0.99044 0.56561 1.7511
treat.b 2.1265 0.57198 3.7177
sex.b 0.18166 0.70814 0.25653
pre.b 2.9987 0.60174 4.9834
age -0.22949E-01 0.21337E-01 -1.0755
scale 1.9955 0.32093 6.2180
# ... deleted some output
detach()
On Mon, Oct 31, 2011 at 2:10 AM, Ben Bolker <bbolker at gmail.com> wrote:
On 11-10-31 05:22 AM, Reinhold Kliegl wrote:
One problem appears to be that 111 id's are renumbered from 1 to 55 (56) in the two groups. Unfortunately, it also appears that there is no unique mapping to treatment groups. So there are some subjects with 8 values assigned to one of the groups.
?Thanks. ?It looks like IDs are nested within center (not within treatment). ?That doesn't seem to change the story very much (as far as , though (Zhang et al don't report estimated random-effect variances ...)
library(geepack) data(respiratory) resp1 <- respiratory resp1 <- transform(resp1,
+ ? ? ? ? ? ? ? ? ? ? ? ? ?center=factor(center), + ? ? ? ? ? ? ? ? ? ? ? ? ?id=factor(id))
str(resp1)
'data.frame': 444 obs. of ?8 variables: ?$ center ?: Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ... ?$ id ? ? ?: Factor w/ 56 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ... ?$ treat ? : Factor w/ 2 levels "A","P": 2 2 2 2 2 2 2 2 1 1 ... ?$ sex ? ? : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ... ?$ age ? ? : int ?46 46 46 46 28 28 28 28 23 23 ... ?$ baseline: int ?0 0 0 0 0 0 0 0 1 1 ... ?$ visit ? : int ?1 2 3 4 1 2 3 4 1 2 ... ?$ outcome : int ?0 0 0 0 0 0 0 0 1 1 ...
detach("package:geepack") ## allow detaching of doBy
detach("package:doBy") ? ?## allow detaching of lme4
The data appear also in the HSAUR package, here the 111 subjects identified with 5 months (visits) each. ?I suspect month 0 was used as baseline.
library(HSAUR) data(respiratory) resp2 <- respiratory str(resp2)
'data.frame': 555 obs. of ?7 variables: ?$ centre ? : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ... ?$ treatment: Factor w/ 2 levels "placebo","treatment": 1 1 1 1 1 1 1 1 1 1 ... ?$ sex ? ? ?: Factor w/ 2 levels "female","male": 1 1 1 1 1 1 1 1 1 1 ... ?$ age ? ? ?: num ?46 46 46 46 46 28 28 28 28 28 ... ?$ status ? : Factor w/ 2 levels "poor","good": 1 1 1 1 1 1 1 1 1 1 ... ?$ month ? ?: Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 1 2 3 4 5 1 2 3 4 5 ... ?$ subject ?: Factor w/ 111 levels "1","2","3","4",..: 1 1 1 1 1 2 2 2 2 2 ... Reinhold On Sun, Oct 30, 2011 at 10:00 PM, Ben Bolker <bbolker at gmail.com> wrote:
?There's a fairly recent paper by Zhang et al (2011) of interest to folks on this list DOI: 10.1002/sim.4265 ?In response to a post on the AD Model Builder users' list, I took a quick shot at re-doing some of their results (they have extensive simulation results, which I haven't tried to replicate yet, and an analysis of binary data from Davis (1991) which is included (I *think* it's the same data set -- the description and size of the data set match exactly) in the geepack data set). ?If anyone's interested, my results so far are posted at http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.Rnw http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.pdf ?So far the R approaches I've tried agree closely with each other and with glmmADMB (except MASS::glmmPQL, which I expected to be different -- the rest all use either Laplace approx. or AGHQ). ?They *don't* agree with the results Zhang et al got, yet -- I'm sure there's something I'm missing in the contrasts or otherwise ... ?Suggestions or improvements are welcome. ?cheers ? ?Ben Bolker
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20111031/f6d5abcb/attachment.pl>
The reference is given on the last page of the pdf from the original post. Zhang, H., N. Lu, C. Feng, S.W. Thurston, Y. Xia, L. Zhu, and X. M. Tu (2011). On fitting generalized linear mixed-effects models for binary responses using different statistical packages. Statistics in Medicine.
On Mon, Oct 31, 2011 at 7:49 PM, Saang-Yoon Hyun <shyunuw at gmail.com> wrote:
Hi, Could you show the reference: e.g., authors, year, tilte, journal, pages, etc.? ? If you could share its PDF version of the paper, it would be more helpful. Thank you, Saang-Yoon On Mon, Oct 31, 2011 at 12:15 PM, Reinhold Kliegl <reinhold.kliegl at gmail.com
wrote:
Here is some comparison between glm, ?glmer (using lme4 with Laplace) and sabreR (which uses AGHQ, if I recall correctly). The glm analysis replicates exactly the results reported in Everitt and Hothorn (2002, Table 13.1, around page 175). ?Thus, I am pretty sure I am using the correct data. ?sabreR gives estimates both for the standard homogeneous model, replicating the glm(), ?as well as a random effects model, replicating glmer() pretty closely, I think Reinhold
# HSAUR contains respiratory data
data("respiratory", package = "HSAUR")
# Convert pretest to covariate "baseline"
resp <- subset(respiratory, month > "0" )
resp$baseline <- rep(subset(respiratory, month == "0")$status, each=4)
# Numeric variants of factors
resp$centr.b <- as.integer(resp$centre) - 1
resp$treat.b <- as.integer(resp$treatment) - 1
resp$sex.b <- as.integer(resp$sex)-1
resp$pre.b <- as.integer(resp$baseline) - 1
resp$status.b <- as.integer(resp$status) - 1
resp$id <- as.integer(resp$subject)
# Pooled estimate
summary(m_glm.b <-glm(status.b ~ centr.b + treat.b + sex.b + pre.b +
age, family="binomial", data=resp)) Call: glm(formula = status.b ~ centr.b + treat.b + sex.b + pre.b + ? ?age, family = "binomial", data = resp) Deviance Residuals: ? ?Min ? ? ? 1Q ? Median ? ? ? 3Q ? ? ?Max -2.3146 ?-0.8551 ? 0.4336 ? 0.8953 ? 1.9246 Coefficients: ? ? ? ? ? ? Estimate Std. Error z value Pr(>|z|) (Intercept) -0.900171 ? 0.337653 ?-2.666 ?0.00768 ** centr.b ? ? ?0.671601 ? 0.239567 ? 2.803 ?0.00506 ** treat.b ? ? ?1.299216 ? 0.236841 ? 5.486 4.12e-08 *** sex.b ? ? ? ?0.119244 ? 0.294671 ? 0.405 ?0.68572 pre.b ? ? ? ?1.882029 ? 0.241290 ? 7.800 6.20e-15 *** age ? ? ? ? -0.018166 ? 0.008864 ?-2.049 ?0.04043 * --- Signif. codes: ?0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 (Dispersion parameter for binomial family taken to be 1) ? ?Null deviance: 608.93 ?on 443 ?degrees of freedom Residual deviance: 483.22 ?on 438 ?degrees of freedom AIC: 495.22 Number of Fisher Scoring iterations: 4
# glmer library(lme4) print(m_glmer_4.L <- glmer(status ~ centre + treatment + sex + baseline
+ age + (1|subject), + ? ? ? family=binomial,data=resp), cor=FALSE) Generalized linear mixed model fit by the Laplace approximation Formula: status ~ centre + treatment + sex + baseline + age + (1 | subject) ? Data: resp ?AIC ? BIC logLik deviance ?443 471.7 -214.5 ? ? ?429 Random effects: ?Groups ?Name ? ? ? ?Variance Std.Dev. ?subject (Intercept) 3.8647 1.9659 Number of obs: 444, groups: subject, 111 Fixed effects: ? ? ? ? ? ? ? ? ? Estimate Std. Error z value Pr(>|z|) (Intercept) ? ? ? ?-1.64438 ? ?0.75829 ?-2.169 ? 0.0301 * centre2 ? ? ? ? ? ? 1.04382 ? ?0.53193 ? 1.962 ? 0.0497 * treatmenttreatment ?2.15746 ? ?0.51757 ? 4.168 3.07e-05 *** sexmale ? ? ? ? ? ? 0.20194 ? ?0.66117 ? 0.305 ? 0.7600 baselinegood ? ? ? ?3.06990 ? ?0.52608 ? 5.835 5.37e-09 *** age ? ? ? ? ? ? ? ?-0.02540 ? ?0.01998 ?-1.271 ? 0.2037 --- Signif. codes: ?0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
# Pooled and clustered library(sabreR) attach(resp) m_sabreR <- sabre(status.b ~ centr.b + treat.b + sex.b + pre.b + age,
case=id)
print(m_sabreR)
# ... deleted some output (Standard Homogenous Model) ? ?Parameter ? ? ? ? ? ? ?Estimate ? ? ? ? Std. Err. ? ? ? ?Z-score ? ?____________________________________________________________________ ? ?(intercept) ? ? ? ? ? -0.90017 ? ? ? ? ?0.33765 ? ? ? ? ?-2.6660 ? ?centr.b ? ? ? ? ? ? ? ?0.67160 ? ? ? ? ?0.23957 ? ? ? ? ? 2.8034 ? ?treat.b ? ? ? ? ? ? ? ? 1.2992 ? ? ? ? ?0.23684 ? ? ? ? ? 5.4856 ? ?sex.b ? ? ? ? ? ? ? ? ?0.11924 ? ? ? ? ?0.29467 ? ? ? ? ?0.40467 ? ?pre.b ? ? ? ? ? ? ? ? ? 1.8820 ? ? ? ? ?0.24129 ? ? ? ? ? 7.7999 ? ?age ? ? ? ? ? ? ? ? ? -0.18166E-01 ? ? ?0.88644E-02 ? ? ?-2.0493 (Random Effects Model) ? ?Parameter ? ? ? ? ? ? ?Estimate ? ? ? ? Std. Err. ? ? ? ?Z-score ? ?____________________________________________________________________ ? ?(intercept) ? ? ? ? ? ?-1.6642 ? ? ? ? ?0.84652 ? ? ? ? ?-1.9660 ? ?centr.b ? ? ? ? ? ? ? ?0.99044 ? ? ? ? ?0.56561 ? ? ? ? ? 1.7511 ? ?treat.b ? ? ? ? ? ? ? ? 2.1265 ? ? ? ? ?0.57198 ? ? ? ? ? 3.7177 ? ?sex.b ? ? ? ? ? ? ? ? ?0.18166 ? ? ? ? ?0.70814 ? ? ? ? ?0.25653 ? ?pre.b ? ? ? ? ? ? ? ? ? 2.9987 ? ? ? ? ?0.60174 ? ? ? ? ? 4.9834 ? ?age ? ? ? ? ? ? ? ? ? -0.22949E-01 ? ? ?0.21337E-01 ? ? ?-1.0755 ? ?scale ? ? ? ? ? ? ? ? ? 1.9955 ? ? ? ? ?0.32093 ? ? ? ? ? 6.2180 # ... deleted some output
detach()
On Mon, Oct 31, 2011 at 2:10 AM, Ben Bolker <bbolker at gmail.com> wrote:
On 11-10-31 05:22 AM, Reinhold Kliegl wrote:
One problem appears to be that 111 id's are renumbered from 1 to 55 (56) in the two groups. Unfortunately, it also appears that there is no unique mapping to treatment groups. So there are some subjects with 8 values assigned to one of the groups.
?Thanks. ?It looks like IDs are nested within center (not within treatment). ?That doesn't seem to change the story very much (as far as , though (Zhang et al don't report estimated random-effect variances ...)
library(geepack) data(respiratory) resp1 <- respiratory resp1 <- transform(resp1,
+ ? ? ? ? ? ? ? ? ? ? ? ? ?center=factor(center), + ? ? ? ? ? ? ? ? ? ? ? ? ?id=factor(id))
str(resp1)
'data.frame': 444 obs. of ?8 variables: ?$ center ?: Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ... ?$ id ? ? ?: Factor w/ 56 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3
3 ...
?$ treat ? : Factor w/ 2 levels "A","P": 2 2 2 2 2 2 2 2 1 1 ... ?$ sex ? ? : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ... ?$ age ? ? : int ?46 46 46 46 28 28 28 28 23 23 ... ?$ baseline: int ?0 0 0 0 0 0 0 0 1 1 ... ?$ visit ? : int ?1 2 3 4 1 2 3 4 1 2 ... ?$ outcome : int ?0 0 0 0 0 0 0 0 1 1 ...
detach("package:geepack") ## allow detaching of doBy
detach("package:doBy") ? ?## allow detaching of lme4
The data appear also in the HSAUR package, here the 111 subjects identified with 5 months (visits) each. ?I suspect month 0 was used as baseline.
library(HSAUR) data(respiratory) resp2 <- respiratory str(resp2)
'data.frame': 555 obs. of ?7 variables: ?$ centre ? : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ... ?$ treatment: Factor w/ 2 levels "placebo","treatment": 1 1 1 1 1 1 1 1
1 1 ...
?$ sex ? ? ?: Factor w/ 2 levels "female","male": 1 1 1 1 1 1 1 1 1 1
...
?$ age ? ? ?: num ?46 46 46 46 46 28 28 28 28 28 ... ?$ status ? : Factor w/ 2 levels "poor","good": 1 1 1 1 1 1 1 1 1 1 ... ?$ month ? ?: Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 1 2 3 4 5 1 2
3 4 5 ...
?$ subject ?: Factor w/ 111 levels "1","2","3","4",..: 1 1 1 1 1 2 2 2
2 2 ...
Reinhold On Sun, Oct 30, 2011 at 10:00 PM, Ben Bolker <bbolker at gmail.com> wrote:
?There's a fairly recent paper by Zhang et al (2011) of interest to folks on this list DOI: 10.1002/sim.4265 ?In response to a post on the AD Model Builder users' list, I took a quick shot at re-doing some of their results (they have extensive simulation results, which I haven't tried to replicate yet, and an analysis of binary data from Davis (1991) which is included (I *think* it's the same data set -- the description and size of the data set
match
exactly) in the geepack data set). ?If anyone's interested, my results so far are posted at http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.Rnw http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.pdf ?So far the R approaches I've tried agree closely with each other and with glmmADMB (except MASS::glmmPQL, which I expected to be different
--
the rest all use either Laplace approx. or AGHQ). ?They *don't* agree with the results Zhang et al got, yet -- I'm sure there's something I'm missing in the contrasts or otherwise ... ?Suggestions or improvements are welcome. ?cheers ? ?Ben Bolker
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
? ? ? ?[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models