Skip to content

lmer() with 'na.action=na.exclude'; error with summary()

5 messages · John Maindonald, Martin Maechler, Ben Bolker

#
The following demonstrates the issue:
+                      (1 | school:class), data = science,  
+                      na.action=na.exclude)
Linear mixed model fit by REML ['lmerMod']
Formula: like ~ sex + PrivPub + (1 | school) + (1 | school:class)
   Data: science

REML criterion at convergence: 5546.5

Scaled residuals: 
Error in quantile.default(resids) : 
  missing values and NaN's not allowed if 'na.rm' is FALSE
Linear mixed model fit by REML ['lmerMod']
Formula: like ~ sex + PrivPub + (1 | school) + (1 | school:class)
   Data: science

REML criterion at convergence: 5546.5

Scaled residuals: 
Error in quantile.default(resids) : 
  missing values and NaN's not allowed if 'na.rm' is FALSE
With na.action = na.omit, there is no problem.
R version 3.1.1 Patched (2014-08-22 r66458)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lme4_1.1-7      Rcpp_0.11.2     Matrix_1.1-4    DAAG_1.20       lattice_0.20-29

loaded via a namespace (and not attached):
[1] grid_3.1.1          latticeExtra_0.6-26 MASS_7.3-34         minqa_1.2.3         nlme_3.1-117       
[6] nloptr_1.0.4        RColorBrewer_1.0-5  splines_3.1.1       tools_3.1.1    

I doubt that there is an intention for summary.merMod() to throw an error
lmer() has been called with 'na.action=na.exclude?.  It should certainly not 
throw an error with the argument 'show.resids=FALSE?.

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
#
John Maindonald <john.maindonald at ...> writes:
[snip]
This is a confusion between the arguments of the summary method
(summary.merMod) and the *print method* (print.summary.merMod).
We should add a warning to summary that says it is discarding
unused arguments (the ... in the model definition is only there
for compatibility with the summary() generic method).

  print(summary(science.lmer), show.resids=FALSE)

works fine.
Now I'm wondering whether the correct behaviour when there are
NAs in the (extended) residuals is 

  1 omit NAs from the residual quantile calculation (easiest)
  2 return NA values for all quantiles of the residual
  3 return the quantiles plus a statement of the number of NAs.

Any good reason not to just do #1?
#
On 9 Sep 2014, at 2:50, Ben Bolker <bbolker at gmail.com> wrote:

            
That is a subtlety that I had not contemplated.  It is not in forming
the summary object, but in printing it that the problem is evident.
I see no reason not to choose (1),  At the time we prepared the 3rd edition of 
'Data Analysis and Graphics using R?, this was the behaviour. My understanding 
of ?na.action=na.exclude? has been that the result is as for ?na.action=na.omit?, 
except that residuals match up with the original observations, with NAs inserted
as necessary.  In other words, the change from  ?na.action=na.omit? to
?na.action=na.exclude? is all about wharf appears when one explicitly
calculates and returns residuals, not about what happens when summary 
information (quantiles, not the residuals themselves!) is printed.

This strategy also simplifies documenting and describing what is done.
John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
#
> On 9 Sep 2014, at 2:50, Ben Bolker <bbolker at gmail.com>
> wrote:
>> John Maindonald <john.maindonald at ...> writes:
    >> 
    >>> 
    >>> The following demonstrates the issue:
    >>> 
    >>>> library(DAAG) science.lmer <- lmer(like ~ sex + PrivPub
    >>>> + (1 | school) +
    >>> + (1 | school:class), data = science, +
    >>> na.action=na.exclude)
    >>>> summary(science.lmer)
    >>> Linear mixed model fit by REML ['lmerMod']
    >> 
    >> [snip]
    >> 
    >>> Scaled residuals: Error in quantile.default(resids) :
    >>> missing values and NaN's not allowed if 'na.rm' is FALSE
    >>>> ## Suppress details of residuals
    >> 
    >> 
    >>>> summary(science.lmer, show.resids=FALSE)
    >> 
    >> This is a confusion between the arguments of the summary
    >> method (summary.merMod) and the *print method*
    >> (print.summary.merMod).  We should add a warning to
    >> summary that says it is discarding unused arguments (the
    >> ... in the model definition is only there for
    >> compatibility with the summary() generic method).
    >> 
    >> print(summary(science.lmer), show.resids=FALSE)
    >> 
    >> works fine.

    > That is a subtlety that I had not contemplated.  It is not
    > in forming the summary object, but in printing it that the
    > problem is evident.

    >> I doubt that there is an intention for summary.merMod()
    >> to throw an error
    >>> lmer() has been called with 'na.action=na.exclude?.  It
    >>> should certainly not throw an error with the argument
    >>> 'show.resids=FALSE?.
    >> 
    >> Now I'm wondering whether the correct behaviour when
    >> there are NAs in the (extended) residuals is
    >> 
    >> 1 omit NAs from the residual quantile calculation
    >> (easiest)

       > I see no reason not to choose (1), At the time we prepared
       > the 3rd edition of 'Data Analysis and Graphics using R?,
       > this was the behaviour. My understanding of
       > ?na.action=na.exclude? has been that the result is as for
       > ?na.action=na.omit?, except that residuals match up with
       > the original observations, with NAs inserted as necessary.
       > In other words, the change from ?na.action=na.omit? to
       > ?na.action=na.exclude? is all about wharf appears when one
       > explicitly calculates and returns residuals, not about
       > what happens when summary information (quantiles, not the
       > residuals themselves!) is printed.

       > This strategy also simplifies documenting and describing
       > what is done.

    >> 2 return NA values for all quantiles of the residual 

    >> 3 return the quantiles plus a statement of the number of NAs.
    >> 
    >> Any good reason not to just do #1?

We should  follow the behavior of lm()
in all such cases of doubt unless we have very very strong
reasons not to follow lm().
It has been *the* example and motivator to a big extent AFAIR
for the  na.action  semantics when they were introduced (in S).

Martin

    > John Maindonald email: john.maindonald at anu.edu.au phone :
    > +61 2 (6125)3473 fax : +61 2(6125)5549 Centre for
    > Mathematics & Its Applications, Room 1194, John Dedman
    > Mathematical Sciences Building (Building 27) Australian
    > National University, Canberra ACT 0200.
#
On 14-09-09 10:02 AM, Martin Maechler wrote:
I just checked, and it definitely na.omits (it pulls out the
'residuals' component from the summary object).

  I've already pushed a fix along these lines (different implementation,
but same behaviour) to Github, although I haven't pushed tests or
updated the NEWS file ...

  Ben