Skip to content
Prev 29443 / 63424 Next

rowSums()/colSums() don't preserve the 'integer' storage mode

Hi Bill,
Bill.Venables at csiro.au wrote:
The cost seems rather small. For the "if" block that is tagged /* columns */
in the do_colsum() function (src/main/array.c), that would give something
like:

     if (OP == 0 || OP == 1) { /* columns */
         int ans_type = (type == REALSXP || OP == 1) ? REALSXP : INTSXP;
	int ina = type == INTSXP ? NA_INTEGER : NA_LOGICAL;
         PROTECT(ans = allocVector(ans_type, p));
         for (j = 0; j < p; j++) {
             double rsum = 0.0;
             int isum = 0;
             switch (type) {
             case REALSXP:
                 rx = REAL(x) + n*j;
                 for (i = cnt = 0; i < n; i++, rx++) {
                     if (!ISNAN(*rx)) {cnt++; rsum += *rx;}
                     else if (keepNA) {rsum = NA_REAL; break;}
                 }
                 if (OP == 1) {
                     if (cnt == 0 || ISNAN(rsum))
                         REAL(ans)[j] = NA_REAL;
                     else
                         REAL(ans)[j] = rsum / cnt;
                 } else
                     REAL(ans)[j] = rsum;
             break;
             case INTSXP: case LGLSXP:
                 ix = (type == INTSXP ? INTEGER(x) : LOGICAL(x)) + n*j;
                 for (i = cnt = 0; i < n; i++, ix++) {
                     if (*ix != ina) {cnt++; isum += *ix;}
                     else if (keepNA) {isum = NA_INTEGER; break;}
                 }
                 if (OP == 1) {
                     if (cnt == 0 || isum == NA_INTEGER)
                         REAL(ans)[j] = NA_REAL;
                     else
                         REAL(ans)[j] = ((double) isum) / cnt;
                 } else
                     INTEGER(ans)[j] = isum;
             break;
             default:
                 /* we checked the type above, but be sure */
                 UNIMPLEMENTED_TYPEt("do_colsum", type);
             }
         }
     }

So now you have 42 lines instead of 37 in the current code.
Then do something similar for the "if" block tagged /* rows */.
Basically the new code would not be more complicated than the
current code.

In the end rowSums()/colSums() will do the _right_ thing i.e.
they'll preserve the 'integer' storage mode, and, by doing so, will
behave consistently with the sum() function.

BTW since the LGLSXP type is supported then the "'x' must be numeric"
error msg at the beginning of the do_colsum() function is inappropriate.

Cheers,
H.