
once more: methods on missing data

On Thu, 7 Jun 2001 Maciej.Hoffman-Wecker at evotecoai.com wrote in part:
<snip>
Ideally, they shouldn't.  NA is missing data -- that is, we don't know the
value of the statistic because some data were not measured. That's why,
for example, NA & FALSE is FALSE, not NA: the value of the expression is
known no matter what the first operand is.
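
To make the NA & FALSE point concrete, here is a small sketch of how R's logical operators treat NA. The variable names are just for illustration; the result is NA only when the unknown operand could change the answer.

```r
# NA means "unknown"; the result is known whenever the other operand
# already decides the outcome.
and_false <- NA & FALSE   # FALSE either way
and_true  <- NA & TRUE    # depends on the missing value, so NA
or_true   <- NA | TRUE    # TRUE either way
or_false  <- NA | FALSE   # depends on the missing value, so NA

print(c(and_false = and_false, and_true = and_true,
        or_true = or_true, or_false = or_false))
```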

The results for min() and max() have the rationale that, e.g., max(a, max(b))
should return the same as max(a, b) even when b is empty. There are even some
examples where this is genuinely helpful.
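
A minimal sketch of that identity, assuming b is an empty numeric vector (the names a, b, inner, nested and flat are made up for the example). max() of an empty vector gives -Inf with a warning, which is exactly the value that leaves the outer max() unchanged.

```r
a <- c(3, 7)
b <- numeric(0)                      # no data at all

# max() of an empty vector returns -Inf and warns; suppress the warning here.
inner  <- suppressWarnings(max(b))
nested <- max(a, inner)              # max(a, max(b))
flat   <- max(a, b)                  # max(a, b)

print(inner)
print(identical(nested, flat))
```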

If the others were to return a value, I think NaN (undefined numerical
result) would be better than NA (missing data); NaN is what mean() already
returns. This would argue for changing the return value of quantile() as
well.
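
As a rough illustration of the NaN versus NA distinction, using mean() only (the variable names are invented for the example): the mean of no data is an undefined number, while the mean of data containing an NA is unknown.

```r
empty_mean   <- mean(numeric(0))   # no data: 0/0, an undefined numerical result
missing_mean <- mean(c(1, NA))     # some data not measured: result is unknown

# is.na() is TRUE for both NA and NaN; is.nan() tells them apart.
print(is.nan(empty_mean))
print(is.na(missing_mean))
print(is.nan(missing_mean))
```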

However, I think it's reasonable for a function to refuse to calculate the
variance of no data. We do have try() to handle errors if needed.
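
A sketch of how a caller might cope with such a refusal using try(). The function strict_var() is hypothetical, written only for this example; it is not how var() itself behaves.

```r
# Hypothetical wrapper (not part of R) that refuses to compute a variance
# from fewer than two observations.
strict_var <- function(x) {
  if (length(x) < 2) stop("need at least two observations")
  var(x)
}

# try() catches the error and returns an object of class "try-error".
res_empty <- try(strict_var(numeric(0)), silent = TRUE)
res_ok    <- try(strict_var(c(1, 2, 3)), silent = TRUE)

if (inherits(res_empty, "try-error")) {
  cat("no variance for empty input\n")
}
print(res_ok)
```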

	-thomas

Thomas Lumley			Asst. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
