
colSums etc. documentation (PR#2545)

2 messages · bert_gunter@merck.com, David Brahm

#
For your consideration:
     [,1] [,2]
[1,]    1   NA
[2,]    2   NA
[3,]    3   NA
[1]  6 NA

Correct, according to the documentation
[1] 6 0

Surprising to me, but, as documented, correctly consistent with apply() and
[1] 0
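
The commands that produced the output above did not survive in this archive. A plausible reconstruction follows, offered only as a sketch: the variable name x and the cbind() call are assumptions, chosen because they reproduce the matrix shown.

```r
# Assumed input (not from the original post): a 3x2 matrix whose
# second column contains no non-missing values.
x <- cbind(1:3, NA)
x

# Default na.rm = FALSE: the all-NA column sums to NA.
colSums(x)

# With na.rm = TRUE every value in column 2 is dropped,
# so that column is an empty sum, which is 0.
colSums(x, na.rm = TRUE)

# The same rule in sum() itself: nothing is left to add.
sum(NA, na.rm = TRUE)
```

Under these assumptions the three calls should print "[1]  6 NA", "[1] 6 0" and "[1] 0" respectively, matching the output quoted above.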

The documentation for sum() explicitly notes that the sum of an empty set is
0 by definition, so that users will not get caught by this behavior (or at
least cannot complain about it if they are). I wonder if it might be wise to
include this note in the documentation for colSums, colMeans, etc. too, as
the current Help file says only:
"If there are no non-missing values in a range to be summed over, the
component of the output is set to NA."
This is obviously the case only when na.rm=FALSE.
Cheers,
Bert Gunter
Biometrics Research
Merck & Company
PO Box 200, Rahway, NJ 07065-0900
Ph: (732) 594-7765    Fax: 594-1565

"The business of the statistician is to catalyze the scientific learning
process."    --  George E.P. Box

------------------------------------------------------------------------------
#
Bert Gunter <bert_gunter@merck.com> wrote:
Certainly correct (the empty set has sum 0 and product 1) and consistent with
other functions.  But you are not the first to mention being surprised by this!
A little more documentation might be helpful, how about:

  If na.rm=TRUE and there are no non-missing values in a range to be summed
  over, the resulting component of the output is 0, consistent with apply().

It seems to me that the na.rm=FALSE case obviously produces `NA', so there's no
need to document this explicitly.
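
A short check of the two claims above (the empty set has sum 0 and product 1, and colSums agrees with apply()) might look like the following. The matrix m is a hypothetical example, not taken from the thread.

```r
# Identity elements: an empty sum is 0, an empty product is 1.
empty_sum  <- sum(numeric(0))
empty_prod <- prod(numeric(0))

# Hypothetical matrix with an all-NA second column.
m <- cbind(1:3, NA)

# Column sums computed two ways, both dropping missing values.
by_apply   <- apply(m, 2, sum, na.rm = TRUE)
by_colSums <- colSums(m, na.rm = TRUE)

# all.equal() compares numerically, so the integer result from
# apply() and the double result from colSums() count as equal.
all.equal(by_apply, by_colSums)
```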