colSums etc. documentation (PR#2545)

Thu, Feb 13, 2003 7:10 AM

For your consideration:

     [,1] [,2]
[1,]    1   NA
[2,]    2   NA
[3,]    3   NA

[1]  6 NA

Correct, according to the documentation

[1] 6 0

Surprising to me, but, as documented, correctly consistent with apply() and

[1] 0

The documentation for sum() explicitly notes that the sum of an empty set is
0 by definition, so that users will not get caught by this behavior (or at
least cannot complain about it if they are). I wonder if it might be wise to
include this note in the documentation for colSums, colMeans, etc. too, as
the current Help file says only:
"If there are no non-missing values in a range to be summed over, the
component of the output is set to NA."
This is obviously the case only when na.rm=FALSE.
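
The behavior described above can be reproduced with a minimal sketch like the one below. The matrix construction `cbind(1:3, NA)` is an assumption chosen to match the printed output; the original commands are not shown in this message.

```r
# Assumed reconstruction: a 3 x 2 matrix whose second column is entirely NA
x <- cbind(1:3, NA)

# Default na.rm = FALSE: the all-NA column sums to NA
sums_default <- colSums(x)

# na.rm = TRUE: every value in column 2 is dropped, so the sum runs over
# an empty set and the result for that column is 0
sums_narm <- colSums(x, na.rm = TRUE)

# sum() itself behaves the same way once the NA is removed
empty_sum <- sum(NA, na.rm = TRUE)

print(sums_default)
print(sums_narm)
print(empty_sum)
```

Only the first case matches the sentence quoted from the Help file; with na.rm=TRUE the component is 0 rather than NA.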
Cheers,
Bert Gunter
Biometrics Research
Merck & Company
PO Box 200, Rahway, NJ 07065-0900
Ph: (732) 594-7765    Fax: 594-1565

"The business of the statistician is to catalyze the scientific learning
process."    --  George E.P. Box

------------------------------------------------------------------------------

David Brahm

Thu, Feb 13, 2003 7:45 AM

Bert Gunter <bert_gunter@merck.com> wrote:

Certainly correct (the empty set has sum 0 and product 1) and consistent with
other functions.  But you are not the first to mention being surprised by this!
A little more documentation might be helpful, how about:

  If na.rm=TRUE and there are no non-missing values in a range to be summed
  over, the resulting component of the output is 0, consistent with apply().

It seems to me that the na.rm=FALSE case obviously produces `NA', so there's no
need to document this explicitly.
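
As a rough check of the consistency claimed above, the following sketch compares colSums() with the equivalent apply() call and shows the empty-set conventions for sum() and prod(). The example matrix is an assumption, not something taken from the thread.

```r
# Assumed example matrix: the second column contains only NA
x <- cbind(1:3, NA)

# Column sums via apply() and via colSums(), both dropping missing values
by_apply   <- apply(x, 2, sum, na.rm = TRUE)
by_colsums <- colSums(x, na.rm = TRUE)

# Empty-set conventions: the empty sum is 0 and the empty product is 1
empty_sum  <- sum(numeric(0))
empty_prod <- prod(numeric(0))

print(by_apply)
print(by_colsums)
print(c(empty_sum, empty_prod))
```

Both approaches return 0 for the all-NA column, which is the behavior the proposed wording would document.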

-- David Brahm (brahm@alum.mit.edu)