FWIW, I paste below a possible change to the warning-generating part of the do_matrix function in R/src/main/array.c. It adds the kind of warning that Abby is asking for and, IMHO, would more often help users find bugs in their code than interfere with intended behaviour.
matrix (1:6, nrow = 2, ncol = 3)
matrix (1:12, nrow = 2, ncol = 3)
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
Warning message:
In matrix(1:12, nrow = 2, ncol = 3) :
data length incompatible with size of matrix
matrix (1:7, nrow = 2, ncol = 3)
Warning messages:
1: In matrix(1:7, nrow = 2, ncol = 3) :
data length [7] is not a sub-multiple or multiple of the number of rows [2]
2: In matrix(1:7, nrow = 2, ncol = 3) :
data length incompatible with size of matrix
matrix (1:8, nrow = 2, ncol = 3)
Warning messages:
1: In matrix(1:8, nrow = 2, ncol = 3) :
data length [8] is not a sub-multiple or multiple of the number of columns [3]
2: In matrix(1:8, nrow = 2, ncol = 3) :
data length incompatible with size of matrix
matrix (1:6, nrow = 0, ncol = 0)
<0 x 0 matrix>
matrix (numeric(0), nrow = 2, ncol = 3)
     [,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]   NA   NA   NA
matrix(1:2, ncol = 8)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    1    2    1    2    1    2    1    2
It would be nice to combine the new warning with the one about "...not a sub-multiple or multiple..." into a single warning where appropriate (as in two of the examples above), but that would require bigger surgery, way above my pay grade.
Kind regards
Wolfgang Huber
Index: array.c
===================================================================
--- array.c (revision 79951)
+++ array.c (working copy)
@@ -133,18 +133,19 @@
nc = (int) ceil((double) lendat / (double) nr);
}
- if(lendat > 0) {
+ if (lendat > 1) {
R_xlen_t nrc = (R_xlen_t) nr * nc;
- if (lendat > 1 && nrc % lendat != 0) {
+ if ((nrc % lendat) != 0) {
if (((lendat > nr) && (lendat / nr) * nr != lendat) ||
((lendat < nr) && (nr / lendat) * lendat != nr))
warning(_("data length [%d] is not a sub-multiple or multiple of the number of rows [%d]"), lendat, nr);
else if (((lendat > nc) && (lendat / nc) * nc != lendat) ||
((lendat < nc) && (nc / lendat) * lendat != nc))
- warning(_("data length [%d] is not a sub-multiple or multiple of the number of columns [%d]"), lendat, nc);
- }
- else if ((lendat > 1) && (nrc == 0)){
+ warning(_("data length [%d] is not a sub-multiple or multiple of the number of columns [%d]"), lendat, nc);
+ if (nrc == 0)
warning(_("data length exceeds size of matrix"));
+ if (nrc != lendat)
+ warning(_("data length incompatible with size of matrix"));
}
}
------
// And here, for easy checking, is that part of the code in its new form:
if (lendat > 1) {
    R_xlen_t nrc = (R_xlen_t) nr * nc;
    if ((nrc % lendat) != 0) {
        if (((lendat > nr) && (lendat / nr) * nr != lendat) ||
            ((lendat < nr) && (nr / lendat) * lendat != nr))
            warning(_("data length [%d] is not a sub-multiple or multiple of the number of rows [%d]"), lendat, nr);
        else if (((lendat > nc) && (lendat / nc) * nc != lendat) ||
                 ((lendat < nc) && (nc / lendat) * lendat != nc))
            warning(_("data length [%d] is not a sub-multiple or multiple of the number of columns [%d]"), lendat, nc);
        if (nrc == 0)
            warning(_("data length exceeds size of matrix"));
        if (nrc != lendat)
            warning(_("data length incompatible with size of matrix"));
    }
}
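
For anyone who wants to check the proposed logic without rebuilding R, here is a minimal standalone sketch of it. The function name patched_matrix_warnings and the WARN_* flags are made up for illustration and are not part of R; long long stands in for R_xlen_t, and the function returns a bit set instead of calling warning(). The brace nesting follows the patch as posted.

```c
#include <assert.h>

/* Hypothetical flags, one per warning in the patched check. */
enum {
    WARN_ROWS     = 1 << 0, /* "... multiple of the number of rows [%d]" */
    WARN_COLS     = 1 << 1, /* "... multiple of the number of columns [%d]" */
    WARN_EXCEEDS  = 1 << 2, /* "data length exceeds size of matrix" */
    WARN_INCOMPAT = 1 << 3  /* "data length incompatible with size of matrix" */
};

/* Model of the patched check: returns the set of warnings it would raise.
 * lendat is the data length; nr and nc are the resolved dimensions. */
int patched_matrix_warnings(long long lendat, int nr, int nc)
{
    int w = 0;
    assert(lendat >= 0 && nr >= 0 && nc >= 0);

    if (lendat > 1) {
        long long nrc = (long long) nr * nc;
        if ((nrc % lendat) != 0) {
            /* nrc is nonzero here, so nr and nc are safe divisors. */
            if (((lendat > nr) && (lendat / nr) * nr != lendat) ||
                ((lendat < nr) && (nr / lendat) * lendat != nr))
                w |= WARN_ROWS;
            else if (((lendat > nc) && (lendat / nc) * nc != lendat) ||
                     ((lendat < nc) && (nc / lendat) * lendat != nc))
                w |= WARN_COLS;
            /* Kept to mirror the patch. With this nesting it cannot fire:
             * nrc == 0 would give nrc % lendat == 0 in the test above. */
            if (nrc == 0)
                w |= WARN_EXCEEDS;
            /* Always true here, since nrc % lendat != 0 implies nrc != lendat. */
            if (nrc != lendat)
                w |= WARN_INCOMPAT;
        }
    }
    return w;
}
```

With the arguments from the examples at the top of this message, (12, 2, 3) yields WARN_INCOMPAT only, (7, 2, 3) yields WARN_ROWS | WARN_INCOMPAT, (8, 2, 3) yields WARN_COLS | WARN_INCOMPAT, and (6, 0, 0), (0, 2, 3) and (2, 1, 8) yield no warning.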
On 2 Feb 2021, at 00:27, Abby Spurdle <spurdle.a at gmail.com> wrote:
So, does that mean that a clean result is contingent on the length of
the data being a multiple of both the number of rows and columns?
However, this rule is not straightforward.
#EXAMPLE 1
#what I would expect
matrix (1:12, 0, 0)
<0 x 0 matrix>
Warning message:
In matrix(1:12, 0, 0) : data length exceeds size of matrix
#EXAMPLE 2
#don't like this
matrix (numeric (), 2, 3)
     [,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]   NA   NA   NA
The first example is what I would expect, but is inconsistent with the
previous examples.
(Because zero is a valid multiple of twelve).
I dislike the second example with recycling of a zero-length vector.
This *is* covered in the help file, but also seems inconsistent with
the previous examples.
(Because two and three are not valid multiples of zero).
Also, I can't think of any reason why someone would want to construct
a matrix with extra data, and then discard part of it.
And even if there were, why not allow an arbitrarily longer length?
On Mon, Feb 1, 2021 at 10:08 PM Martin Maechler
<maechler at stat.math.ethz.ch> wrote:
Abby Spurdle
on Mon, 1 Feb 2021 19:50:32 +1300 writes:
I'm a little surprised that the following doesn't trigger an error or a warning.
matrix (1:256, 8, 8)
The help file says that the main argument is recycled, if it's too short.
But doesn't say what happens if it's too long.
It's somewhat subtler than one may assume :
matrix(1:9, 2,3)
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
Warning message:
In matrix(1:9, 2, 3) :
data length [9] is not a sub-multiple or multiple of the number of rows [2]
matrix(1:8, 2,3)
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
Warning message:
In matrix(1:8, 2, 3) :
data length [8] is not a sub-multiple or multiple of the number of columns [3]
matrix(1:12, 2,3)
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
So it looks to me as if the current behavior is quite on purpose.
Are you sure it's not documented at all when reading the docs
carefully? (I did *not*, just now).
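
The three results above follow from the existing check in do_matrix, which can be read off the removed ("-") lines of the diff earlier in this message. Below is a minimal standalone sketch of that check, assuming those lines are the whole story. The function name current_matrix_warnings and the CUR_WARN_* flags are hypothetical; long long stands in for R_xlen_t, and a bit set is returned instead of calling warning().

```c
#include <assert.h>

/* Hypothetical flags, one per warning in the existing check. */
enum {
    CUR_WARN_ROWS    = 1 << 0, /* "... multiple of the number of rows [%d]" */
    CUR_WARN_COLS    = 1 << 1, /* "... multiple of the number of columns [%d]" */
    CUR_WARN_EXCEEDS = 1 << 2  /* "data length exceeds size of matrix" */
};

/* Model of the pre-patch check: returns the set of warnings it would raise. */
int current_matrix_warnings(long long lendat, int nr, int nc)
{
    int w = 0;
    assert(lendat >= 0 && nr >= 0 && nc >= 0);

    if (lendat > 0) {
        long long nrc = (long long) nr * nc;
        if (lendat > 1 && nrc % lendat != 0) {
            /* Warn only if lendat fails to divide, or be divided by,
             * the row count; otherwise try the column count. */
            if (((lendat > nr) && (lendat / nr) * nr != lendat) ||
                ((lendat < nr) && (nr / lendat) * lendat != nr))
                w |= CUR_WARN_ROWS;
            else if (((lendat > nc) && (lendat / nc) * nc != lendat) ||
                     ((lendat < nc) && (nc / lendat) * lendat != nc))
                w |= CUR_WARN_COLS;
        }
        else if ((lendat > 1) && (nrc == 0))
            w |= CUR_WARN_EXCEEDS;
    }
    return w;
}
```

Under this model (9, 2, 3) gives the rows warning and (8, 2, 3) the columns warning, while (12, 2, 3) and (256, 8, 8) give none, because 12 is a multiple of both 2 and 3 and 256 is a multiple of 8. The zero-size case (12, 0, 0) gives "data length exceeds size of matrix".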