r-sig mixed models mailing list response - R-SIG-mixed-models

Ben Bolker · 2013-01-21T14:24:52Z

On 13-01-20 10:22 PM, Belinda Burns wrote:

> Dear Ben,

[cc'ing to r-sig-mixed-models]

> Thank you for your response to my question to the mixed models mailing
> list (see below). I wasn't sure if I could just reply directly to the
> mailing list request email so I thought I'd email you directly.
> Your comment on the heteroscedasticity makes some sense to me - however,
> a log transformation (of proportions + 1) does little to improve the
> distribution of the data.

OK.

> I have though

It might be a good idea to look at the books by Zuur and co-authors,
which are fairly ecologist-friendly. While I don't agree with everything
in them, they generally exhibit common sense and give good general
descriptions.

Transformation of this sort doesn't make sense.  NB models
must be run on *count* data.  It sounds like your data act more
like a Beta distribution (used to model proportions), since
"number of seconds" isn't really that likely to act like a count.

Note that the random effects are extremely small, suggesting that
the amount of noise at the observation level (within the individual
counts) is big enough to swamp any observable effects of Group or
Individual within group.

  I would consider trying this with family="beta" and zero-inflation,
although admittedly that's a combination I have never tested.
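To make the suggestion concrete, here is a minimal base-R sketch of what "Beta plus zero-inflation" means for a proportion response: a point mass at zero mixed with Beta-distributed non-zero values. It only simulates data and is not a model fit. All parameter values and names (`p_zero`, `mu`, `phi`) are illustrative assumptions, not estimates from the gibbon data.

```r
# Sketch: simulate zero-inflated Beta proportions.
# Every number below is a made-up, illustrative assumption.
set.seed(101)

n      <- 1000   # hypothetical number of 10-minute observation periods
p_zero <- 0.5    # probability of a structural zero (behaviour never occurs)
mu     <- 0.15   # mean proportion of time when the behaviour does occur
phi    <- 10     # Beta precision; larger values mean less spread

# Mean/precision parameterisation of the Beta distribution:
#   shape1 = mu * phi, shape2 = (1 - mu) * phi
is_zero <- rbinom(n, size = 1, prob = p_zero) == 1
nonzero <- rbeta(n, shape1 = mu * phi, shape2 = (1 - mu) * phi)
prop    <- ifelse(is_zero, 0, nonzero)

mean(prop == 0)          # observed fraction of zeros, close to p_zero
summary(prop[prop > 0])  # non-zero proportions, with mean close to mu
```

Unlike a count model, this keeps the response on the [0, 1) scale, which is the point of the Beta suggestion above.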

Make sure to plot your data and make sure that the estimates
make sense!

Negative binomial dispersion parameter: 1.0235 (std. err.: 0.15743)
Zero-inflation: 0.48117  (std. err.:  0.03323 )

Log-likelihood: -1089.92


Thank you for reading,

Belinda Burns

-------------------------------------------------------------------------------------------------------------------------------------------------

Belinda Burns <10517197 at ...> writes:

Dear all,

I hope this is the correct place for my question, if not, my apologies! I
am analysing several behaviour variables obtained by observing captive
gibbons. The raw values are in the form proportion of ten minutes spent
doing the behaviour, and most of the behaviours are zero-inflated and
negatively skewed.

At the moment I am interested in modelling the proportion of time that
adult gibbons spend grooming their mates, such that the models take the
form:

Grooms_mate ~ Age + Species*Sex + Family_composition + Repro_phase*Sex

where species is a factor with 3 levels, family composition is a binary
variable (they either have offspring or not) and repro_phase is the
reproductive phase of the female (4 levels).

Ideally I should be including individual and group as random effects
(individuals are nested within groups) and so I would like to use a
mixed model approach; however, diagnostic plots of residuals vs
fitted values show heteroscedasticity (increasing spread with
increasing fitted values) and plots of residuals vs predictors
suggest that one species is less variable than the other two and
gibbons with offspring are more variable than those without. The
inclusion of a species*family_composition weighted variance function
(using weights=varIdent(form=~1|Species*Family_composition) in a
gls model) seems to improve the homogeneity of the residuals...
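For readers following along, the weighted-variance gls fit described above might look roughly like the sketch below. It assumes the nlme package and uses simulated stand-in data: the data frame `d`, the response `y`, the stratum standard deviations and the reduced fixed-effects formula are all hypothetical, not the poster's actual data or model.

```r
library(nlme)  # recommended package, ships with R

set.seed(42)
# Hypothetical stand-in data: 3 species x 2 family types, 40 rows per cell,
# with a residual SD that differs by Species x Family_composition stratum.
d <- expand.grid(rep = 1:40,
                 Species = c("A", "B", "C"),
                 Family_composition = c("no_offspring", "offspring"))
d$Age <- runif(nrow(d), 6, 30)

stratum    <- interaction(d$Species, d$Family_composition)
# SDs in the order of levels(stratum): A.no, B.no, C.no, A.off, B.off, C.off
sd_stratum <- c(0.5, 1.0, 1.0, 0.8, 1.6, 1.6)
d$y <- 0.02 * d$Age + rnorm(nrow(d), sd = sd_stratum[as.integer(stratum)])

# Constant-variance fit versus one residual variance per stratum
fit_const <- gls(y ~ Age + Species + Family_composition, data = d)
fit_var   <- gls(y ~ Age + Species + Family_composition, data = d,
                 weights = varIdent(form = ~ 1 | Species * Family_composition))

# Estimated residual SD ratios relative to the reference stratum
coef(fit_var$modelStruct$varStruct, unconstrained = FALSE)

# Same fixed effects in both fits, so the REML-based comparison is usable
anova(fit_const, fit_var)
```

Note the `form = ~ 1 | ...` syntax: `form` is a named argument that takes a one-sided formula.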

I therefore have two questions (among a million others!): Can I
include the two random effects in gls, or, vice versa, a varIdent
structure in lmer? (the only contact I know doing mixed modelling in
R uses lmer with MCMC estimation of p-values and so I am most
comfortable using that to include the random effects) How do I write
individual and group in as random effects considering individual is
nested in group?

lmer does not handle "R-side" effects (heteroscedasticity/varStruct/etc.)
at present.  You should be able to use random=~1|group/individual
in lme to account for individuals nested in groups.  However,
heteroscedasticity is also a common feature of lognormal data: could
you get away with some transformation of the form
log(small_number+proportion)
(realizing that picking small_number is a bit of a can of worms)?
Or qlogis(small_number+proportion)? (Should be roughly equivalent if
the proportions are typically small.)
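As a rough sketch of the advice above, the following combines the three pieces in nlme's `lme`: a `log(small_number + proportion)` response, individuals nested in groups via `random = ~ 1 | Group/Individual`, and a `varIdent` variance structure. The simulated design (24 groups, 2 adults per group, 15 observations each), every effect size, and the value of `small_number` are assumptions made purely for illustration.

```r
library(nlme)  # recommended package, ships with R

set.seed(1)
# Hypothetical design: 24 groups (8 per species), 2 adults per group,
# 15 observation periods per adult.
n_group <- 24; n_ind <- 2; n_obs <- 15
d <- expand.grid(obs = 1:n_obs,
                 Individual = factor(1:n_ind),
                 Group = factor(1:n_group))
g <- as.integer(d$Group)
d$Species            <- factor(c("A", "B", "C"))[(g - 1) %/% 8 + 1]
d$Family_composition <- factor(c("no_offspring", "offspring"))[g %% 2 + 1]

# Simulate small proportions with group and individual effects and a
# residual SD that depends on species and family composition.
ind_id  <- interaction(d$Group, d$Individual, drop = TRUE)
b_group <- rnorm(n_group, sd = 0.6)
b_ind   <- rnorm(nlevels(ind_id), sd = 0.4)
sd_res  <- ifelse(d$Species == "A", 0.4, 0.8) *
           ifelse(d$Family_composition == "offspring", 1.5, 1)
eta     <- -3 + b_group[g] + b_ind[as.integer(ind_id)] +
           rnorm(nrow(d), sd = sd_res)
d$prop  <- plogis(eta)   # inverse logit: proportions in (0, 1), mostly small

small_number <- 0.005    # arbitrary; results can be sensitive to this choice
d$y <- log(small_number + d$prop)
# qlogis(small_number + d$prop) would be the logit analogue

fit <- lme(y ~ Species + Family_composition,
           random  = ~ 1 | Group/Individual,   # individuals nested in groups
           weights = varIdent(form = ~ 1 | Species * Family_composition),
           data = d)

VarCorr(fit)  # variance components: Group, Individual within Group, residual
coef(fit$modelStruct$varStruct, unconstrained = FALSE)  # residual SD ratios
```

Because `Individual` is coded 1, 2 within every group, the `Group/Individual` syntax is what tells `lme` that individual 1 in one group is not the same animal as individual 1 in another.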

  Ben Bolker


-- 
Belinda Lee Burns, BSc(Hons)
PhD student
Email: burnsb02 at student.uwa.edu.au