Specifying outcome variable in binomial glmm: single responses vs cbind?

3 messages · Ben Bolker, a y

a y
#
What is the difference between fitting a binomial glmm (without random item
effects) in the following two ways?

1.
Data formatted in the following way:

(data_long)
ID    correct    condition    itemID
1     1          A            i1
1     0          A            i2
1     1          A            i3
1     1          A            i4
2     0          B            i1
2     1          B            i2
2     1          B            i3
2     0          B            i4

Fitting a model without item random effects:

glmer(correct ~ condition + (1|ID), family = binomial, data = data_long)


2.
Data formatted this way (summing over the correct responses):

(data_short)
ID    sum_correct    condition    itemID
1     3              A            NA
2     2              B            NA

Fitting the following model, assuming there were only 4 items (I've seen
dozens of examples like this):

glmer(cbind(sum_correct, 4 - sum_correct) ~ condition + (1|ID), family = binomial, data = data_short)
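
For reference, the second layout can be derived from the first by summing within subject. This is only a sketch in base R using the eight example rows shown above; the column name n_items is an assumption added here so that the number of trials is not hard-coded as 4.

```r
# The example rows from data_long, typed in by hand
data_long <- data.frame(
  ID        = rep(1:2, each = 4),
  correct   = c(1, 0, 1, 1, 0, 1, 1, 0),
  condition = rep(c("A", "B"), each = 4),
  itemID    = rep(paste0("i", 1:4), times = 2)
)

# One row per subject: number of correct responses ...
data_short <- aggregate(correct ~ ID + condition, data = data_long, FUN = sum)
names(data_short)[3] <- "sum_correct"

# ... and number of trials (n_items is a hypothetical helper column)
data_short$n_items <- aggregate(correct ~ ID + condition,
                                data = data_long, FUN = length)$correct

print(data_short)

# The response for the aggregated model would then be
#   cbind(sum_correct, n_items - sum_correct)
```

Collapsing like this only loses nothing when every predictor in the model is constant within a subject, as condition is here; item-level predictors or item random effects would require the long format.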

---
I figured these models should be identical, but in my experience they are
very much not. What am I missing? When is the second (more) appropriate?

Thanks for any help,
Andrew
Ben Bolker
#
On 16-07-01 07:37 PM, a y wrote:
I believe they should give different likelihoods but identical
parameter estimates, identical *differences* among likelihoods (i.e.
among competing models fitted to the same data), etc. That is,
disaggregating the data changes the log-likelihood only by an additive
constant that does not depend on the parameters. I would be very
interested to see a counter-example to that statement! In general, the
second form should be quicker to fit, provide residuals that are easier
to interpret, etc.
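
One way to check this claim is to fit both forms to the same simulated data. The sketch below is illustrative only: the sample size, effect sizes, random-intercept SD, and seed are all assumptions, not values from this thread. It relies on the fact that a binomial likelihood for k successes in n trials equals the product of the n Bernoulli likelihoods times choose(n, k), so the expected log-likelihood gap is the sum of lchoose(n, k) over subjects.

```r
library(lme4)

set.seed(1)                      # arbitrary seed (assumption)
n_id    <- 40                    # number of subjects (assumption)
n_items <- 4                     # items per subject, as in the question

# Between-subject condition and a random intercept per subject
condition <- rep(c("A", "B"), length.out = n_id)
eta <- -0.2 + 0.8 * (condition == "B") + rnorm(n_id, sd = 1)

# Long format: one 0/1 response per subject and item
data_long <- data.frame(
  ID        = factor(rep(seq_len(n_id), each = n_items)),
  condition = rep(condition, each = n_items),
  correct   = rbinom(n_id * n_items, size = 1,
                     prob = rep(plogis(eta), each = n_items))
)

# Short format: number correct per subject
data_short <- aggregate(correct ~ ID + condition, data = data_long, FUN = sum)
names(data_short)[3] <- "sum_correct"

fit_long  <- glmer(correct ~ condition + (1 | ID),
                   family = binomial, data = data_long)
fit_short <- glmer(cbind(sum_correct, n_items - sum_correct) ~ condition + (1 | ID),
                   family = binomial, data = data_short)

# Fixed effects should agree up to optimizer tolerance
print(rbind(long = fixef(fit_long), short = fixef(fit_short)))

# Log-likelihoods should differ by the log binomial coefficients
ll_diff <- as.numeric(logLik(fit_short)) - as.numeric(logLik(fit_long))
const   <- sum(lchoose(n_items, data_short$sum_correct))
print(c(ll_diff = ll_diff, const = const))
```

If the two fits disagree by more than optimizer noise on real data, a likely cause is that the aggregation does not match the model, for example a predictor that varies within subject or a trial count that is not the same for everyone.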
a y
#
I answered my own question, so feel free to disregard this topic.