Using response variable in interaction as explanatory variable in glm crashes R
On Mon, Oct 09, 2017 at 03:52:43PM +0000, Martin Maechler wrote:
Jan van der Laan <rhelp at eoos.dds.nl>
on Fri, 6 Oct 2017 12:13:39 +0200 writes:
> It is actually model.matrix that crashes, not glm. Same
> crash occurs with e.g. lm.
> model.matrix(dob_mon ~ dob_day*dob_mon, data = tab)
> also crashes R.
Yes, segmentation fault. It only happens when these are *logical* variables, not, e.g., when they are transformed to integer.

The C code in src/library/stats/src/model.c tries to eliminate occurrences of the LHS of the formula from the RHS when building the model matrix, and it does work fine in the integer case.

Part of the culprit code may be this (from line 717), with the isLogical(.) which, in our case, shifts the pointer by 1 in the call to firstfactor():

    int adj = isLogical(var_i)?1:0;
    // avoid overflow of jstart * nn PR#15578
    firstfactor(&rx[jstart * nn], n, jnext - jstart,
                REAL(contrast), nrows(contrast),
                ncols(contrast), INTEGER(var_i)+adj);

Then in firstfactor(), we see the segfault (when running R with '-d gdb'):
> model.matrix(dob_mon ~ dob_day*dob_mon, data = tab)
Program received signal SIGSEGV, Segmentation fault.
0x00007fffeafa76b5 in firstfactor (ncx=0, v=0x5c3b37c, ncc=1, nrc=2, c=0x5c90008,
nrx=8, x=0x5cbf150) at ../../../../../R/src/library/stats/src/model.c:252
252 else xj[i] = cj[v[i]-1];
Missing separate debuginfos, .................
(gdb) list
247	    for (int j = 0; j < ncc; j++) {
248		xj = &x[j * (R_xlen_t)nrx];
249		cj = &c[j * (R_xlen_t)nrc];
250		for (int i = 0; i < nrx; i++)
251		    if(v[i] == NA_INTEGER) xj[i] = NA_REAL;
252		    else xj[i] = cj[v[i]-1];
253	    }
254	}
255
And indeed, in the debugger, i=7 and v[i] is "outside" the vector: v[] has
length 7 and is hence indexed 0:6.
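
To make the index arithmetic concrete, here is a minimal sketch in C. It is not code from R: first_out_of_bounds() is a hypothetical helper, and it assumes only what the gdb session above shows, namely that the loop runs nrx times over a pointer that has been advanced by adj elements into a buffer of len elements. It reports the first loop index whose read would land past the end of the buffer.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only (not part of model.c).
 *
 * Models a loop "for (i = 0; i < nrx; i++) ... v[i] ..." where
 * v = base + adj and base points to a buffer of len elements.
 * Returns the first i at which v[i] would be outside the buffer,
 * or -1 if every read stays in bounds.
 */
int first_out_of_bounds(size_t len, size_t adj, int nrx)
{
    for (int i = 0; i < nrx; i++) {
        /* v[i] is base[i + adj]; valid only while i + adj < len */
        if ((size_t)i + adj >= len)
            return i;
    }
    return -1;
}
```

With len = 8, adj = 1 and nrx = 8 (the values matching the session above), the helper returns 7, which is the i seen in the debugger; with adj = 0 it returns -1, consistent with the integer case working fine.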
Dear Martin,

I just wanted to thank you for providing details on your approach to debugging. Often I see bug fixes and I wonder "how the heck did they figure that out?" so I am very excited when I see details like these on the process (and not just the end result), so that I can learn.

Best,
Scott

Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/