Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing

I think it is not a bug. It is a general property of interactions.
This property is best observed if all variables are factors
(qualitative).

For example, take three variables (factors). You ask for as many
interactions as possible, except the interaction term between two
particular variables. If that two-way interaction is not constant, it
differs across the values of the remaining variable. More precisely,
it is coded separately for every value of that variable. In other
words, you get a three-way interaction that includes all values of
that variable.
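A minimal sketch of this three-factor case, in case it helps. The factor names and levels (X1, X2, X3 with levels p/q, A/B, u/v/w) are my own assumptions for illustration and are not taken from your example. It compares the full model with the one where the X1:X2 term is removed:

```r
# Hypothetical three-factor design; names and levels are illustrative only.
df3 <- expand.grid(X1 = c("p", "q"),
                   X2 = c("A", "B"),
                   X3 = c("u", "v", "w"))

# All interactions up to third order.
mm_full <- model.matrix(~ (X1 + X2 + X3)^3, data = df3)

# Same model, but without the two-way term X1:X2.
mm_drop <- model.matrix(~ (X1 + X2 + X3)^3 - X1:X2, data = df3)

# Count three-way columns: a name with two ":" splits into three parts.
is_three_way <- function(m) lengths(strsplit(colnames(m), ":", fixed = TRUE)) == 3

print(colnames(mm_full)[is_three_way(mm_full)])
print(colnames(mm_drop)[is_three_way(mm_drop)])

# Both matrices have the same number of columns and the same rank.
print(c(ncol(mm_full), ncol(mm_drop)))
print(c(qr(mm_full)$rank, qr(mm_drop)$rank))
```

With X1:X2 removed, the three-way term should carry one column per level of X3 instead of one per non-reference level, so the total number of columns does not change.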

An even smaller example is the following script with only two
variables, each being a factor:

 df <- expand.grid(X1=c("p","q"), X2=c("A","B","C"))
 print(model.matrix(~(X1+X2)^2    ,data=df))
 print(model.matrix(~(X1+X2)^2 -X1,data=df))
 print(model.matrix(~(X1+X2)^2 -X2,data=df))

The result is:

  (Intercept) X1q X2B X2C X1q:X2B X1q:X2C
1           1   0   0   0       0       0
2           1   1   0   0       0       0
3           1   0   1   0       0       0
4           1   1   1   0       1       0
5           1   0   0   1       0       0
6           1   1   0   1       0       1

  (Intercept) X2B X2C X1q:X2A X1q:X2B X1q:X2C
1           1   0   0       0       0       0
2           1   0   0       1       0       0
3           1   1   0       0       0       0
4           1   1   0       0       1       0
5           1   0   1       0       0       0
6           1   0   1       0       0       1

  (Intercept) X1q X1p:X2B X1q:X2B X1p:X2C X1q:X2C
1           1   0       0       0       0       0
2           1   1       0       0       0       0
3           1   0       1       0       0       0
4           1   1       0       1       0       0
5           1   0       0       0       1       0
6           1   1       0       0       0       1

Thus, in the second result, there is no main effect of X1. Instead, the
effect of X1 depends on the value of X2: A, B, or C. In fact, this is a
two-way interaction that includes all three values of X2. In the third
result, there is no main effect of X2. The effect of X2 depends on the
value of X1: p or q.
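One way to check this reading, as a rough sketch rather than a formal argument: the three matrices should describe the same model, only parameterized differently. The object names m1, m2, m3 below are mine. The script compares ranks and rebuilds the missing main-effect columns from the interaction columns:

```r
df <- expand.grid(X1 = c("p", "q"), X2 = c("A", "B", "C"))
m1 <- model.matrix(~ (X1 + X2)^2,      data = df)
m2 <- model.matrix(~ (X1 + X2)^2 - X1, data = df)
m3 <- model.matrix(~ (X1 + X2)^2 - X2, data = df)

# Each matrix has full column rank ...
ranks <- sapply(list(m1, m2, m3), function(m) qr(m)$rank)
print(ranks)

# ... and stacking them side by side adds no new directions,
# so all three span the same column space.
rank_all <- qr(cbind(m1, m2, m3))$rank
print(rank_all)

# The dropped X1 main effect is the sum of the X1q columns over all levels of X2.
x1q_rebuilt <- rowSums(m2[, c("X1q:X2A", "X1q:X2B", "X1q:X2C")])
print(all(x1q_rebuilt == m1[, "X1q"]))

# Likewise, the dropped X2B column is the sum over both levels of X1.
x2b_rebuilt <- rowSums(m3[, c("X1p:X2B", "X1q:X2B")])
print(all(x2b_rebuilt == m1[, "X2B"]))
```

So removing a main effect here does not remove anything from the fitted model; it only moves that effect into the interaction columns.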

A complicating element with your example seems to be that your X1 and
X2 are not factors.

   Arie
On Thu, Oct 12, 2017 at 7:12 PM, Tyler <tylermw at gmail.com> wrote: