-----Original Message-----
From: Cameron Gillies [mailto:cgillies at ualberta.ca]
Sent: Sunday, December 03, 2006 6:31 PM
To: Prof Brian Ripley; John Fox
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] lmer and a response that is a proportion
Dear Brian and John,
Thanks for your insight. I'll clarify a couple of things
in case it changes your advice.
My response is a ratio of two measures taken during a bird's
path, which varies from 0 to 1, so I cannot convert it
to columns of the number of successes. It has to be reported as
the proportion. I could logit transform it to make it
normal, but I am trying to avoid that so I can analyze it directly.
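
[Editorial sketch, not part of the original message. The logit transform mentioned above is available in base R as qlogis(), with plogis() as its inverse. The vector `p` is made-up illustrative data. Note that ratios of exactly 0 or 1 map to infinite values, which is one practical obstacle to transforming such a response.]

```r
# Hypothetical ratios in [0, 1], including both boundary values
p <- c(0, 0.1, 0.5, 0.9, 1)

# Logit transform: log(p / (1 - p))
z <- qlogis(p)
print(z)

# Back-transform the interior values to recover the original ratios
print(plogis(z[2:4]))

# Boundary values are not finite on the logit scale
print(is.finite(z))
```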
The subjects are individual birds and I have a range of
sample sizes from each bird (from 8 to >200, average of about
75 measurements/bird).
Thanks!
Cam
On 12/3/06 3:47 PM, "Prof Brian Ripley" <ripley at stats.ox.ac.uk> wrote:
On Sun, 3 Dec 2006, John Fox wrote:
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Cameron
Gillies
Sent: Sunday, December 03, 2006 1:58 PM
To: r-help at stat.math.ethz.ch
Subject: [R] lmer and a response that is a proportion
Greetings all,
I am using lmer (lme4 package) to analyze data where the response is
a proportion (0 to 1). It appears to work, but I am wondering if
the analysis is treating the response appropriately -
As far as I know, you can specify the response as a proportion, in
which case the binomial counts would be given via the weights
argument -- at least that's how it's done in glm(). An alternative
that should be equivalent is to specify a two-column matrix with
counts of "successes" and "failures" as the response.
the proportion of successes without the counts wouldn't be
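
[Editorial sketch, not part of the original message. It illustrates the two glm() specifications described above: a proportion with the binomial counts passed through `weights`, and a two-column matrix of successes and failures. The data (`n`, `x`, `succ`) are simulated purely for illustration and the variable names are assumptions.]

```r
# Hypothetical data: 'succ' successes out of 'n' trials per observation
set.seed(1)
n    <- c(8, 20, 35, 50, 75, 120, 200)
x    <- seq(-1.5, 1.5, length.out = length(n))
succ <- rbinom(length(n), size = n, prob = plogis(0.3 + 0.8 * x))
prop <- succ / n

# (a) Proportion as the response, binomial totals supplied via weights
fit_w <- glm(prop ~ x, family = binomial, weights = n)

# (b) Two-column matrix of successes and failures as the response
fit_m <- glm(cbind(succ, n - succ) ~ x, family = binomial)

# The two fits should agree in coefficients and standard errors
print(all.equal(coef(fit_w), coef(fit_m)))
print(all.equal(coef(summary(fit_w))[, "Std. Error"],
                coef(summary(fit_m))[, "Std. Error"]))
```

Both forms need the counts. A bare ratio with no underlying number of trials, as in the bird-path data, fits neither specification.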
I have used both family=binomial and quasibinomial - is one more
appropriate when the response is a proportion? The coefficient
estimates are identical, but the standard errors are larger with
family=binomial.
The difference is that in the binomial family the dispersion is fixed
to 1, while in the quasibinomial family it is estimated as a free
parameter. If the standard errors are larger with family=binomial,
then that suggests that the data are underdispersed (relative to the
binomial); if the difference is substantial -- the factor is the
square root of the estimated dispersion -- then the binomial model is
probably not appropriate for the data.
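
[Editorial sketch, not part of the original message. For an ordinary GLM it checks the relationship described above: binomial and quasibinomial fits share coefficient estimates, and their standard errors differ by the square root of the estimated dispersion. The data are simulated for illustration only. This applies to glm(); it does not address the mixed-model case.]

```r
# Hypothetical binomial data: 40 observations of 50 trials each
set.seed(2)
n    <- rep(50, 40)
x    <- rnorm(40)
succ <- rbinom(40, size = n, prob = plogis(-0.2 + 0.5 * x))

fit_b <- glm(cbind(succ, n - succ) ~ x, family = binomial)
fit_q <- glm(cbind(succ, n - succ) ~ x, family = quasibinomial)

# Estimated dispersion from the quasibinomial fit (fixed at 1 for binomial)
phi  <- summary(fit_q)$dispersion
se_b <- coef(summary(fit_b))[, "Std. Error"]
se_q <- coef(summary(fit_q))[, "Std. Error"]

# Same point estimates; standard errors scale by sqrt(phi)
print(all.equal(coef(fit_b), coef(fit_q)))
print(all.equal(se_q, se_b * sqrt(phi)))
```

An estimated dispersion below 1 makes the quasibinomial standard errors smaller than the binomial ones, which is the underdispersed pattern described in the question.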
John's last deduction is appropriate to a GLM, but not to
a GLMM. I don't have detailed experience with lmer for this, but I
do for various other fitting routines for GLMM. Remember there are at
least two sources of randomness in a GLMM, and let us keep it simple
and have just a subject effect and a measurement error. Then if
over-dispersion is happening within subjects, forcing the binomial
dispersion (at the measurement level) to 1 tends to increase the
estimate of the subject-level variance component to compensate, which
will in turn increase some of the standard errors.
(Please note the 'tends' in that para, as the details
matter. For the cognoscenti, think about plot and sub-plot