-----Original Message-----
From: R-package-devel [mailto:r-package-devel-bounces at r-project.org] On
Behalf Of Paul Buerkner
Sent: March 30, 2017 6:54 AM
To: r-package-devel at r-project.org
Subject: [R-pkg-devel] Problem in stats::model.matrix when omitting two-
way interactions
Hi all,
recently I stumbled upon a problem in stats::model.matrix that I think is
worth reporting.
When I run:
dat <- data.frame(
  y  = rnorm(8),
  x1 = factor(rep(0:1, each = 4)),
  x2 = factor(rep(rep(0:1, each = 2), 2)),
  x3 = factor(rep(0:1, 4))
)
stats::model.matrix(y ~ x1+x2+x3 + x1:x2:x3, dat)
I get a matrix with 12 columns, which are linearly dependent and thus not
identified in a linear model:
summary(lm(y ~ x1+x2+x3 + x1:x2:x3, dat))
Of course, there is usually no need for such a formula that omits the two-
way interactions, but from my point of view, model.matrix should still return
only 8 columns (or fewer) in order to produce identified models.
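As a quick check of the numbers above, the column count and the rank of the design matrix can be inspected directly. This is only a sketch: the seed and the object names X, fit and X_full are added here for illustration and are not part of the original example.

```r
set.seed(1)  # assumption: seed added only to make y reproducible

dat <- data.frame(
  y  = rnorm(8),
  x1 = factor(rep(0:1, each = 4)),
  x2 = factor(rep(rep(0:1, each = 2), 2)),
  x3 = factor(rep(0:1, 4))
)

# Design matrix for the formula without two-way interactions
X <- stats::model.matrix(y ~ x1 + x2 + x3 + x1:x2:x3, dat)
ncol(X)       # 12 columns
qr(X)$rank    # rank 8, so 4 columns are redundant

# lm() drops the aliased columns and reports them as NA coefficients
fit <- lm(y ~ x1 + x2 + x3 + x1:x2:x3, dat)
sum(is.na(coef(fit)))  # 4

# For comparison: the full factorial formula
X_full <- stats::model.matrix(y ~ x1 * x2 * x3, dat)
ncol(X_full)  # 8 columns
```

The full factorial formula y ~ x1 * x2 * x3 is included only for comparison; it covers the same 8 cells with 8 columns.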
I wonder if this is some sort of intended behavior or just a side effect of the
way model.matrix handles factors.
Many thanks in advance.
Paul