Message-ID: <23448.57730.644941.84382@stat.math.ethz.ch>
Date: 2018-09-12T09:50:58Z
From: Martin Maechler
Subject: var() with 0-length vector -- docs inconsistent with result
In-Reply-To: <DM5P10602MB0089AD308AD4A2159BD015EA8A040@DM5P10602MB0089.NAMP106.PROD.OUTLOOK.COM>
>>>>> Raubertas, Richard via R-devel
>>>>> on Tue, 11 Sep 2018 18:52:55 +0000 writes:
> R 3.5.1 on Windows 7. The documentation for 'var' says:
> "These functions return 'NA' when there is only one
> observation (whereas S-PLUS has been returning 'NaN'), and
> fail if 'x' has length zero."
Well, that help page says much more; notably, the paragraph
immediately before the sentence you cite ends by saying
    Note that (the equivalent of) 'var(double(0), use = *)' gives 'NA'
    for 'use = "everything"' and '"na.or.complete"', and gives an
    error in the other cases.
which is true.
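For anyone who wants to check this on their own installation, here is a minimal sketch that tries each documented 'use' value on a zero-length vector. The names 'x', 'uses' and 'res' are only illustrative, and the outcome described is the one the help text quoted above states, not something guaranteed for every R version.

```r
# Sketch: call var() on a zero-length vector under each documented 'use' value
# and record whether it returns NA or signals an error.
x <- double(0)
uses <- c("everything", "all.obs", "complete.obs",
          "na.or.complete", "pairwise.complete.obs")

res <- vapply(uses, function(u) {
  tryCatch(as.character(var(x, use = u)),   # an NA result stays NA here
           error = function(e) "error")     # any error is recorded as "error"
}, character(1))

print(res)
```

Per the help text, "everything" and "na.or.complete" should show NA and the other three should show "error".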
Thank you, Richard, for the report.
The current docs are indeed misleading here.
I think that just erasing the ending half-sentence
" , and fail if 'x' has length zero. "
should do.
> The function 'sd' (based on 'var') has similar documentation.
indeed... and "much worse", it says
    The standard deviation of a zero-length vector (after removal of
    'NA's if 'na.rm = TRUE') is not defined and gives an error.
I propose to also just amend the documentation there, and not to change
the code (as you, Richard, also seem to favor).
After all, `NA` is also pretty close to "not defined", and in that sense valid.
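As a quick illustration of the sd() side, the sketch below assumes that sd() simply takes the square root of var(), as noted in this thread, so that it inherits the NA result. The variable names are illustrative only.

```r
# Sketch: sd() on zero-length input, with and without NA removal.
x <- numeric(0)

s_plain <- sd(x)                       # zero-length input, na.rm = FALSE
s_narm  <- sd(x, na.rm = TRUE)         # zero-length input, na.rm = TRUE
s_allna <- sd(c(NA_real_, NA_real_),   # nothing is left once the NAs are removed
              na.rm = TRUE)

print(c(plain = s_plain, na.rm = s_narm, all.NA = s_allna))
```

On the R version discussed here, all three should come back as NA rather than signalling an error, contrary to what the current sd() help says.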
Martin
> However, I get:
> > var(numeric(0))
> [1] NA
> rather than an error.
> Personally I prefer that basic summary functions like
> 'var' not throw errors even in corner cases. But either
> way, the result and the docs are inconsistent.
> Richard Raubertas