lme4, cloglog vs. binomial link

David Duffy · 2012-06-04T23:32:07Z

On Mon, 4 Jun 2012, Tibor Kiss wrote:
> In the following mixed models, *target_noun_lemma* is the representation
> of the noun in the construction, its categorical value being one of the
> 712 different nouns in the sample. The sample contains 6.841 different
> instances of the construction: 810 instances of determiner omission and
> 6.031 instances of determiner realization.
>
> The distribution of *target_noun_lemma* is highly skewed (which is
> standard for language samples): the top f

I found all this quite dizzying.  I would first look for an optimal link 
function in a fixed-effects GLM fitted to a dataset of just your top 5 
nouns.  I don't think you can read much into the scale of the 
random-effects estimates under different link functions.  The other way 
of doing these things is to change the distribution of the random 
effects: for a single random effect like this there are 
nonparametric/mixture models (you could interpret this as clustering 
your nouns into families).
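
To make the point about scale concrete, here is a short sketch of the two links side by side. The notation is an assumption added for illustration: p is the probability of the outcome, eta is the linear predictor, and epsilon is the error term in the threshold formulation (y = 1 when epsilon < eta), which is standard logistic under logit and standard minimum-Gumbel under cloglog.

```latex
\begin{align*}
\text{logit:} \quad & \eta = \log\frac{p}{1-p}, \qquad p = \frac{1}{1+e^{-\eta}}, \qquad \operatorname{Var}(\varepsilon) = \frac{\pi^2}{3} \\
\text{cloglog:} \quad & \eta = \log\{-\log(1-p)\}, \qquad p = 1-\exp(-e^{\eta}), \qquad \operatorname{Var}(\varepsilon) = \frac{\pi^2}{6}
\end{align*}
```

Since a random intercept is added on the eta scale, its estimated variance is measured against a different implicit error variance under each link, so the raw estimates need not be directly comparable.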

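Interpretation of the AICs depends on the internals of the log-likelihood 
calculation for the different links.  As long as both fits use the same 
data and keep the same constants in the binomial likelihood, they should 
be comparable, in which case logit good, cloglog bad.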
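
As a rough illustration of the AIC comparison, here is a minimal sketch in plain Python that fits a two-parameter binomial GLM by Fisher scoring under each link and computes AIC from the full log-likelihood. Everything in it is an assumption made for illustration: the data are invented toy numbers (not the noun sample), and the names `inv_link`, `loglik` and `fit` are hypothetical helpers, not lme4 or R functions. In R the same comparison would normally go through `glm()` with `family = binomial(link = ...)`.

```python
import math

# Hypothetical grouped data, invented for illustration only:
# x = a covariate, n = trials per cell, y = "successes" per cell.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
n = [50, 50, 50, 50, 50]
y = [3, 8, 15, 27, 40]


def inv_link(eta, link):
    """Return (mu, dmu/deta) for the linear predictor eta."""
    if link == "logit":
        mu = 1.0 / (1.0 + math.exp(-eta))
        return mu, mu * (1.0 - mu)
    if link == "cloglog":
        mu = 1.0 - math.exp(-math.exp(eta))
        return mu, math.exp(eta) * (1.0 - mu)
    raise ValueError("unknown link: " + link)


def loglik(beta, link):
    """Full binomial log-likelihood, including the log(n choose y) constant."""
    total = 0.0
    for xi, ni, yi in zip(x, n, y):
        mu, _ = inv_link(beta[0] + beta[1] * xi, link)
        log_choose = (math.lgamma(ni + 1) - math.lgamma(yi + 1)
                      - math.lgamma(ni - yi + 1))
        total += log_choose + yi * math.log(mu) + (ni - yi) * math.log(1.0 - mu)
    return total


def fit(link, iters=100, tol=1e-10):
    """Fisher scoring for intercept + slope; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for xi, ni, yi in zip(x, n, y):
            mu, dmu = inv_link(b0 + b1 * xi, link)
            var = mu * (1.0 - mu)
            r = (yi - ni * mu) * dmu / var   # score contribution
            w = ni * dmu * dmu / var         # Fisher information weight
            u0 += r
            u1 += r * xi
            i00 += w
            i01 += w * xi
            i11 += w * xi * xi
        # Solve the 2x2 system (information) * step = score.
        det = i00 * i11 - i01 * i01
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return (b0, b1)


# AIC = 2k - 2 logLik, with k = 2 parameters under both links.
aic = {}
for link in ("logit", "cloglog"):
    beta = fit(link)
    aic[link] = 2 * 2 - 2 * loglik(beta, link)
    print(link, "beta =", beta, "AIC =", aic[link])
```

Because the `log_choose` term does not depend on the link, it cancels in the AIC difference. A difference between two fits is only meaningful when both include (or both drop) such constants, which is the sense in which the comparison depends on the internals of the log-likelihood.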

You can sometimes get rid of a random effect completely by transformation. 
The examples I know of are for continuous Y and crossed factors (additive 
and dominant genetic variances), where one factor can be removed.

Cheers, David Duffy.
