Grouped vs ungrouped binary data

1 message · Giovanni Petris

Hello all,

I am trying to analyze binary matched data about approval rates of a
prime minister expressed by the same subjects on two different surveys
six months apart (example and data coming from Agresti, Sec. 12.1.3, p.
494). Following Agresti, and common sense, I want to fit the model

            Approval ~ survey + (1 | id)

I tried to analyze the data in grouped form and then in ungrouped
form, but I am getting different results. Furthermore, Agresti reports ML
estimates that differ from what I get either way. Could anybody
help me understand what I am doing wrong or what I am missing?

Here I set up the data set and fit the model in grouped form, using the
weights argument to glmer:
+                    dimnames = list(First = c("Approve", "Disapprove"),
+                    Second = c("Approve", "Disapprove")))
       First     Second Freq
1    Approve    Approve  794
2 Disapprove    Approve   86
3    Approve Disapprove  150
4 Disapprove Disapprove  570
+                          timevar = "survey",
+                          times = c("First", "Second"),
+                          v.names = "Approval",
+                          varying = c("First", "Second"),
+                          direction = "long")
         Freq survey   Approval id
1.First   794  First    Approve  1
2.First    86  First Disapprove  2
3.First   150  First    Approve  3
4.First   570  First Disapprove  4
1.Second  794 Second    Approve  1
2.Second   86 Second    Approve  2
3.Second  150 Second Disapprove  3
4.Second  570 Second Disapprove  4
+                       data = approval.long, weights = Freq, nAGQ = 5))
Generalized linear mixed model fit by the adaptive Gaussian Hermite approximation 
Formula: Approval ~ survey + (1 | id) 
   Data: approval.long 
   AIC   BIC logLik deviance
 651.1 651.4 -322.6    645.1
Random effects:
 Groups Name        Variance Std.Dev.
 id     (Intercept) 75.732   8.7024  
Number of obs: 8, groups: id, 4

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -0.6251     4.4607  -0.140    0.889    
surveySecond   1.1144     0.1911   5.831 5.53e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr)
surveySecnd -0.021
Warning message:
In model.matrix.default(mt, mf, contrasts) :
  variable 'survey' converted to a factor
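
For reference, the grouped long-format data printed above can be rebuilt from the four cell counts in base R. This is only a minimal sketch, not necessarily the commands that produced the output above; the object names `wide` and `long` are illustrative.

```r
# Cell counts of the 2 x 2 table (first survey by second survey).
wide <- data.frame(
  First  = c("Approve", "Disapprove", "Approve", "Disapprove"),
  Second = c("Approve", "Approve", "Disapprove", "Disapprove"),
  Freq   = c(794, 86, 150, 570),
  stringsAsFactors = FALSE
)

# Stack the two surveys: one row per response pattern and survey.
# Here id indexes the response pattern, not the individual subject.
long <- data.frame(
  Freq     = rep(wide$Freq, 2),
  survey   = rep(c("First", "Second"), each = nrow(wide)),
  Approval = c(wide$First, wide$Second),
  id       = rep(seq_len(nrow(wide)), 2),
  stringsAsFactors = FALSE
)

print(long)
```

In this layout id takes only four values, one per response pattern, which is what "groups: id, 4" in the output above reflects.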

Then I try to fit the model on subject-specific data:
+                                       function(x) rep(x, approval.long$Freq)),
+                                id = rep(1 : sum(approval), 2))
+                                     labels = c("Approve", "Disapprove"))
+                       data = approval.subject, nAGQ = 5))
Generalized linear mixed model fit by the adaptive Gaussian Hermite approximation 
Formula: Approval ~ survey + (1 | id) 
   Data: approval.subject 
  AIC  BIC logLik deviance
 4362 4380  -2178     4356
Random effects:
 Groups Name        Variance Std.Dev.
 id     (Intercept) 11.230   3.3511  
Number of obs: 3200, groups: id, 1600

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -1.0093     0.1155  -8.742  < 2e-16 ***
surveySecond   0.4169     0.1075   3.880 0.000104 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr)
surveySecnd -0.476
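
For reference, one way to expand the four cell counts into one row per subject and survey is sketched below in base R. It is an illustration only, not necessarily the commands used above; the names `first`, `second`, `freq`, `pattern` and `subject` are hypothetical.

```r
# Counts per response pattern (first survey, second survey).
first  <- c("Approve", "Disapprove", "Approve", "Disapprove")
second <- c("Approve", "Approve", "Disapprove", "Disapprove")
freq   <- c(794, 86, 150, 570)

n <- sum(freq)                                   # 1600 subjects
pattern <- rep(seq_along(freq), times = freq)    # pattern index of each subject

# Two rows per subject, sharing the same id across surveys.
subject <- data.frame(
  id       = rep(seq_len(n), 2),
  survey   = rep(c("First", "Second"), each = n),
  Approval = factor(c(first[pattern], second[pattern]),
                    levels = c("Approve", "Disapprove")),
  stringsAsFactors = FALSE
)

# Marginal counts per survey, as a check on the expansion.
print(table(subject$survey, subject$Approval))
```

With the factor levels ordered as ("Approve", "Disapprove"), a binomial fit in R treats the first level as failure, as in glm, so the survey coefficient is on the log-odds of disapproval rather than approval.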

As you can see, I get different estimates for both the fixed effects and
the variance of the random effect. For comparison, Agresti reports the
MLE of the fixed effect (survey) to be -0.556 and the variance of the
random effect to be (5.16)^2. 
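
As a quick check on the reported value: for matched pairs, the conditional ML estimate of the survey effect depends only on the two discordant counts. The sketch below assumes the effect is coded as second survey versus first on the log-odds of approval; the variable names are illustrative.

```r
# Discordant pairs from the 2 x 2 table.
n_dis_to_app <- 86    # disapprove on first survey, approve on second
n_app_to_dis <- 150   # approve on first survey, disapprove on second

# Conditional ML estimate: log ratio of the discordant counts.
beta_cond <- log(n_dis_to_app / n_app_to_dis)
print(round(beta_cond, 3))
```

This gives about -0.556, the same figure as the fixed-effect estimate quoted from Agresti.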

Thank you in advance,
Giovanni Petris