Skip to content

Zhang 2011 (re)analysis

7 messages · Ben Bolker, Reinhold Kliegl, Saang-Yoon Hyun +1 more

#
There's a fairly recent paper by Zhang et al (2011) of interest to
folks on this list

DOI: 10.1002/sim.4265

  In response to a post on the AD Model Builder users' list, I took a
quick shot at re-doing some of their results (they have extensive
simulation results, which I haven't tried to replicate yet, and an
analysis of binary data from Davis (1991) which is included (I *think*
it's the same data set -- the description and size of the data set match
exactly) in the geepack data set).

  If anyone's interested, my results so far are posted at

http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.Rnw
http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.pdf

  So far the R approaches I've tried agree closely with each other and
with glmmADMB (except MASS::glmmPQL, which I expected to be different --
the rest all use either Laplace approx. or AGHQ).  They *don't* agree
with the results Zhang et al got, yet -- I'm sure there's something I'm
missing in the contrasts or otherwise ...

  Suggestions or improvements are welcome.

  cheers
    Ben Bolker
#
On 11-10-31 05:22 AM, Reinhold Kliegl wrote:
Thanks.  It looks like IDs are nested within center (not within
treatment).  That doesn't seem to change the story very much (as far as
, though (Zhang et al don't report estimated random-effect variances ...)
#
On 11-10-31 07:49 PM, Saang-Yoon Hyun wrote:
DOI: 10.1002/sim.4265
On fitting generalized linear mixed-effects models for binary responses
using different statistical packages
Hui Zhang, Naiji Lu, Changyong Feng, Sally W. Thurston,
Yinglin Xia, Liang Zhua and Xin M. Tu

  Statistics in Medicine.  Apparently no page numbers (online publication?)
#
One problem appears to be that 111 id's are renumbered from 1 to 55
(56) in the two groups.
Unfortunately, it also appears that there is no unique mapping to
treatment groups. So there are some subjects with 8 values assigned to
one of the groups.
+                          center=factor(center),
+                          id=factor(id))
'data.frame':	444 obs. of  8 variables:
 $ center  : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ id      : Factor w/ 56 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ treat   : Factor w/ 2 levels "A","P": 2 2 2 2 2 2 2 2 1 1 ...
 $ sex     : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ...
 $ age     : int  46 46 46 46 28 28 28 28 23 23 ...
 $ baseline: int  0 0 0 0 0 0 0 0 1 1 ...
 $ visit   : int  1 2 3 4 1 2 3 4 1 2 ...
 $ outcome : int  0 0 0 0 0 0 0 0 1 1 ...
The data appear also in the HSAUR package, here the 111 subjects
identified with 5 months (visits) each.  I suspect month 0 was used as
baseline.
'data.frame':	555 obs. of  7 variables:
 $ centre   : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ treatment: Factor w/ 2 levels "placebo","treatment": 1 1 1 1 1 1 1 1 1 1 ...
 $ sex      : Factor w/ 2 levels "female","male": 1 1 1 1 1 1 1 1 1 1 ...
 $ age      : num  46 46 46 46 46 28 28 28 28 28 ...
 $ status   : Factor w/ 2 levels "poor","good": 1 1 1 1 1 1 1 1 1 1 ...
 $ month    : Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 1 2 3 4 5 1 2 3 4 5 ...
 $ subject  : Factor w/ 111 levels "1","2","3","4",..: 1 1 1 1 1 2 2 2 2 2 ...

Reinhold
On Sun, Oct 30, 2011 at 10:00 PM, Ben Bolker <bbolker at gmail.com> wrote:
#
Here is some comparison between glm,  glmer (using lme4 with Laplace)
and sabreR (which uses AGHQ, if I recall correctly).
The glm analysis replicates exactly the results reported in Everitt
and Hothorn (2002, Table 13.1, around page 175).  Thus, I am pretty
sure I am using the correct data.

 sabreR gives estimates both for the standard homogeneous model,
replicating the glm(),  as well as a random effects model, replicating
glmer() pretty closely, I think

Reinhold
Call:
glm(formula = status.b ~ centr.b + treat.b + sex.b + pre.b +
    age, family = "binomial", data = resp)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.3146  -0.8551   0.4336   0.8953   1.9246

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.900171   0.337653  -2.666  0.00768 **
centr.b      0.671601   0.239567   2.803  0.00506 **
treat.b      1.299216   0.236841   5.486 4.12e-08 ***
sex.b        0.119244   0.294671   0.405  0.68572
pre.b        1.882029   0.241290   7.800 6.20e-15 ***
age         -0.018166   0.008864  -2.049  0.04043 *
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 608.93  on 443  degrees of freedom
Residual deviance: 483.22  on 438  degrees of freedom
AIC: 495.22

Number of Fisher Scoring iterations: 4
+       family=binomial,data=resp), cor=FALSE)
Generalized linear mixed model fit by the Laplace approximation
Formula: status ~ centre + treatment + sex + baseline + age + (1 | subject)
   Data: resp
 AIC   BIC logLik deviance
 443 471.7 -214.5      429
Random effects:
 Groups  Name        Variance Std.Dev.
 subject (Intercept) 3.8647   1.9659
Number of obs: 444, groups: subject, 111

Fixed effects:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)        -1.64438    0.75829  -2.169   0.0301 *
centre2             1.04382    0.53193   1.962   0.0497 *
treatmenttreatment  2.15746    0.51757   4.168 3.07e-05 ***
sexmale             0.20194    0.66117   0.305   0.7600
baselinegood        3.06990    0.52608   5.835 5.37e-09 ***
age                -0.02540    0.01998  -1.271   0.2037
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
# ... deleted some output
(Standard Homogenous Model)
    Parameter              Estimate         Std. Err.        Z-score
    ____________________________________________________________________
    (intercept)           -0.90017          0.33765          -2.6660
    centr.b                0.67160          0.23957           2.8034
    treat.b                 1.2992          0.23684           5.4856
    sex.b                  0.11924          0.29467          0.40467
    pre.b                   1.8820          0.24129           7.7999
    age                   -0.18166E-01      0.88644E-02      -2.0493



(Random Effects Model)
    Parameter              Estimate         Std. Err.        Z-score
    ____________________________________________________________________
    (intercept)            -1.6642          0.84652          -1.9660
    centr.b                0.99044          0.56561           1.7511
    treat.b                 2.1265          0.57198           3.7177
    sex.b                  0.18166          0.70814          0.25653
    pre.b                   2.9987          0.60174           4.9834
    age                   -0.22949E-01      0.21337E-01      -1.0755
    scale                   1.9955          0.32093           6.2180

# ... deleted some output

        
On Mon, Oct 31, 2011 at 2:10 AM, Ben Bolker <bbolker at gmail.com> wrote:
#
The reference is given on the last page of the pdf from the original post.

Zhang, H., N. Lu, C. Feng, S.W. Thurston, Y. Xia, L. Zhu, and X. M. Tu (2011).
On fitting generalized linear mixed-effects models for binary responses using
different statistical packages. Statistics in Medicine.
On Mon, Oct 31, 2011 at 7:49 PM, Saang-Yoon Hyun <shyunuw at gmail.com> wrote: