na.action = na.augment for random effects in lme4?

3 messages · Andrew Robinson, Phillip Alday, Ben Bolker

#
Hi all,

I'm interested in fitting models and then applying them to data in which some observations have random effects levels that are not in the fitting dataset.  I would like to flag these observations in some way.

Naively, I would prefer to have something like the na.action = na.augment argument so that predictions for observations with previously unseen levels of random effects would simply be missing.  Is there such a capability that I've missed?

Warm wishes,

Andrew


--
Andrew Robinson
Director, CEBRA and Professor of Biosecurity,
School/s of BioSciences and Mathematics & Statistics
University of Melbourne, VIC 3010 Australia
Tel: (+61) 0403 138 955
Email: apro at unimelb.edu.au
Website: https://researchers.ms.unimelb.edu.au/~apro at unimelb/

I acknowledge the Traditional Owners of the land I inhabit, and pay my respects to their Elders.
#
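It doesn't look like it. The documentation for predict.merMod has only the
option allow.new.levels: if FALSE (the default), new levels in newdata
trigger an error; if TRUE, predictions for those observations use the
population-level values.
Maybe packages that add some extra functionality, like merTools, have
something for this.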


Otherwise, you can just filter your newdata with something like

newdata[newdata$groupingvar %in% levels(olddata$groupingvar), ]
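
As a small illustration of the filtering approach above, here is a base R sketch on made-up data. The data frames and the column name groupingvar are hypothetical. As an assumption, unique(as.character(...)) is used in place of levels() so that the check also works when the grouping variable is stored as character rather than as a factor.

```r
# Hypothetical fitting data and new data; "groupingvar" stands for the
# random-effects grouping variable.
olddata <- data.frame(groupingvar = factor(c("a", "a", "b", "b")),
                      y = c(1.0, 1.2, 2.1, 1.9))
newdata <- data.frame(groupingvar = c("a", "b", "c"),
                      x = c(0.5, 1.5, 2.5),
                      stringsAsFactors = FALSE)

# Levels actually observed in the fitting data.
seen_levels <- unique(as.character(olddata$groupingvar))

# Keep only the rows of newdata whose level was seen during fitting.
known <- as.character(newdata$groupingvar) %in% seen_levels
newdata_known <- newdata[known, , drop = FALSE]
```

Note that this drops the rows with unseen levels rather than flagging them, so predictions made from newdata_known no longer line up row by row with the original newdata.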

Phillip
On 11/10/2020 23:42, Andrew Robinson wrote:
#
On 10/11/20 6:05 PM, Phillip Alday wrote:
Yes. Following up:

* do you mean na.exclude (rather than na.augment)?

* it would certainly make sense that you might want these cases to be NA 
rather than predicted at the population level.  In hindsight it might 
have been a good idea to set this up as new.re.levels allowing the 
options c("population","fail", "exclude", "omit").

   Honestly, sorting out and implementing appropriate behaviours for all 
possible combinations of NAs in covariates or grouping variables of the 
initial data set and in the prediction data set has always given me a 
headache ...

    I would say you should do

  newresp <- predict(fitted_model, newdata, allow.new.levels=TRUE)
  new_levels <- !newdata$groupingvar %in% levels(orig_data$groupingvar)
  newresp[new_levels] <- NA
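
A minimal base R sketch of the masking step above, wrapped in a helper. The function name mask_new_levels and the example vectors are hypothetical, and the numeric vector is a made-up stand-in for the output of predict(). The helper compares character values instead of calling levels(), on the assumption that the grouping variable might be a character vector (for which levels() returns NULL) or a factor carrying levels that never occurred in the fitting data.

```r
# Hypothetical helper: set predictions to NA where the grouping level
# was not observed in the fitting data (or is itself NA).
mask_new_levels <- function(pred, new_group, fit_group) {
  stopifnot(length(pred) == length(new_group))
  # Levels actually present in the fitting data, ignoring NAs.
  seen <- unique(as.character(fit_group[!is.na(fit_group)]))
  is_new <- !(as.character(new_group) %in% seen)
  pred[is_new] <- NA
  pred
}

# Made-up stand-in for
# predict(fitted_model, newdata, allow.new.levels = TRUE)
newresp   <- c(10.2, 11.5, 9.8, 10.0)
new_group <- c("a", "c", "b", NA)
fit_group <- factor(c("a", "a", "b", "b"))

flagged <- mask_new_levels(newresp, new_group, fit_group)
```

Here the predictions for level "c" and for the missing grouping value become NA, while the others are left unchanged, so the result stays aligned with the rows of the new data.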